Research Repository

Cadenza Challenge (CAD2): databases for lyric intelligibility task

Cox, Trevor; Roa Dabike, Gerardo

Authors

Gerardo Roa Dabike



Contributors

Gerardo Roa Dabike
Data Curator

William Whitmer
Data Curator

Alinka Greasley
Data Curator

Jennifer Firth
Data Curator

Scott Bannister
Data Curator

Jon Barker
Data Curator

Michael Akeroyd
Data Curator

Abstract

This is the training and validation data for the lyric intelligibility task from the Second Cadenza Machine Learning Challenge (CAD2).

The Cadenza Challenges are improving music production and processing for people with hearing loss. According to the World Health Organization, 430 million people worldwide have disabling hearing loss. Studies show that difficulty understanding lyrics is an important problem to tackle for people with hearing loss. Consequently, this task is about improving the intelligibility of lyrics when listening to pop/rock music over headphones. This needs to be done without losing too much audio quality: you can't improve intelligibility just by turning off the rest of the band! We will use one metric for intelligibility and another for audio quality, and give you different targets to explore the balance between the two.

Please see the Cadenza website for a full description of the data.

Online Publication Date: Dec 9, 2024
Publication Date: Dec 9, 2024
Deposit Date: Jul 30, 2025
DOI: https://doi.org/10.5281/zenodo.12685819
Collection Date: Dec 9, 2025