The First Cadenza Challenges: Using Machine Learning Competitions to Improve Music for Listeners With a Hearing Loss
Roa-Dabike, Gerardo; Akeroyd, Michael A.; Bannister, Scott; Barker, Jon P.; Cox, Trevor J.; Fazenda, Bruno; Firth, Jennifer; Graetzer, Simone; Greasley, Alinka; Vos, Rebecca R.; Whitmer, William M.
Authors
Gerardo Roa-Dabike
Michael A. Akeroyd
Scott Bannister
Jon P. Barker
Prof Trevor Cox T.J.Cox@salford.ac.uk
Professor
Dr Bruno Fazenda B.M.Fazenda@salford.ac.uk
Associate Professor/Reader
Jennifer Firth
Dr Simone Graetzer S.N.Graetzer@salford.ac.uk
Research Fellow
Alinka Greasley
Dr Rebecca Vos Rebecca.Vos@salford.ac.uk
University Fellow
William M. Whitmer
Abstract
Listening to music can be difficult for people with a hearing impairment, and hearing aids are not a universal solution. This paper details the first use of an open challenge methodology to improve the audio quality of music for people with hearing loss through machine learning. The first challenge (CAD1) had 9 participants. The second was a 2024 ICASSP grand challenge (ICASSP24), which attracted 17 entrants. The challenge tasks concerned demixing and remixing pop/rock music to allow a personalized rebalancing of the instruments in the mix, along with amplification to correct for raised hearing thresholds. The software baselines provided for entrants to build upon used two state-of-the-art demixing algorithms: Hybrid Demucs and Open-Unmix. Objective evaluation used HAAQI, the Hearing-Aid Audio Quality Index. No entries improved on the best baseline in CAD1. It is suggested that this arose because demixing algorithms are relatively mature, and recent work has shown that access to large (private) datasets is needed to further improve performance. In response, the ICASSP24 scenario was made more difficult by using loudspeaker reproduction and by specifying gains to be applied before remixing. This also made the scenario more useful for listening through hearing aids. Nine entrants scored better than the best ICASSP24 baseline. Most of the entrants used a refined version of Hybrid Demucs and NAL-R amplification. The highest scoring system combined the outputs of several demixing algorithms in an ensemble approach. These challenges are now open benchmarks for future research, with freely available software and data.
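The rebalancing step described in the abstract (apply a specified gain to each separated instrument, then remix) can be illustrated with a minimal sketch. This is not the challenge baseline code: the function name `remix_stems`, the stem names, and the array layout (channels by samples) are assumptions made for illustration, and the demixing and hearing-loss amplification stages are left out.

```python
import numpy as np


def remix_stems(stems, gains_db):
    """Apply a per-stem gain (in dB) and sum the stems into a single mix.

    stems:    dict mapping a stem name to an array of shape (channels, samples)
    gains_db: dict mapping a stem name to a gain in dB (missing names get 0 dB)

    Both the names and the layout are hypothetical, not taken from the paper.
    """
    mix = None
    for name, audio in stems.items():
        # Convert the dB gain to a linear amplitude factor.
        gain = 10.0 ** (gains_db.get(name, 0.0) / 20.0)
        scaled = gain * np.asarray(audio, dtype=np.float64)
        mix = scaled if mix is None else mix + scaled
    return mix


# Example: boost the vocals by 6 dB and cut the drums by 3 dB.
rng = np.random.default_rng(0)
stems = {name: rng.standard_normal((2, 1000)) for name in ("vocals", "drums", "bass", "other")}
remixed = remix_stems(stems, {"vocals": 6.0, "drums": -3.0})
```

In a full system, the stems would come from a demixing model, and the remixed signal would then be amplified to compensate for the listener's raised hearing thresholds.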
Journal Article Type | Article |
---|---|
Acceptance Date | May 31, 2025 |
Online Publication Date | Jun 10, 2025 |
Publication Date | Jun 20, 2025 |
Deposit Date | Jul 10, 2025 |
Publicly Available Date | Jul 10, 2025 |
Journal | IEEE Open Journal of Signal Processing |
Peer Reviewed | Peer Reviewed |
Volume | 6 |
Pages | 722-734 |
DOI | https://doi.org/10.1109/ojsp.2025.3578299 |
Files
Published Version
(1.5 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/