Y Tang
Automatic speech-to-background ratio selection to maintain speech intelligibility in broadcasts using an objective intelligibility metric
Tang, Y; Fazenda, BM; Cox, TJ
Authors
Dr Bruno Fazenda B.M.Fazenda@salford.ac.uk
Associate Professor/Reader
Prof Trevor Cox T.J.Cox@salford.ac.uk
Professor
Abstract
While mixing, sound producers and audio professionals empirically set the speech-to-background ratio (SBR) based on rules of thumb and their own perception of sounds. However, there is no guarantee that the speech content will be intelligible to the general population consuming content over a wide variety of devices. In this study, an approach is introduced that automatically determines the appropriate SBR for a scene using an objective intelligibility metric. The model-estimated SBR needed for a preset minimum intelligibility level was compared to the listener-preferred SBR for a range of background sounds. It was found that an extra gain must be added to the model estimate, even for listeners with normal hearing. This gain is needed so that an audio scene can be auditioned with comfort and without compromising the sound effects contributed by the background. When the background introduces little informational masking, the extra gain remains almost constant across the various background sounds. However, a larger gain is required for a background that induces informational masking, such as competing speech. The results from a final subjective rating study show that the model-estimated SBR with the additional gain yields the same listening experience as the SBR preferred by listeners.
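The selection procedure summarised in the abstract can be illustrated with a minimal sketch: search for the lowest SBR at which an intelligibility metric reaches a preset minimum, then add an extra gain. Everything named here is an assumption for illustration only. `toy_intelligibility` is a made-up logistic curve standing in for a real objective intelligibility metric, the bisection search is one possible way to find the threshold (the paper's own search method is not described in this record), and the 3 dB extra gain is an arbitrary placeholder rather than a value reported by the authors.

```python
import math


def toy_intelligibility(sbr_db: float) -> float:
    """Hypothetical stand-in for an objective intelligibility metric.

    Maps an SBR in dB to a score in [0, 1] with a logistic curve.
    The midpoint (-5 dB) and slope (0.5) are invented for illustration.
    """
    return 1.0 / (1.0 + math.exp(-0.5 * (sbr_db + 5.0)))


def estimate_sbr(metric, target, lo=-30.0, hi=30.0, tol=0.01):
    """Find (by bisection) the lowest SBR in [lo, hi] with metric(sbr) >= target.

    Assumes the metric does not decrease as the SBR increases.
    """
    if metric(hi) < target:
        raise ValueError("target intelligibility not reachable in search range")
    if metric(lo) >= target:
        return lo
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if metric(mid) >= target:
            hi = mid  # mid already meets the target, so search lower
        else:
            lo = mid  # mid falls short, so search higher
    return hi  # hi always satisfies metric(hi) >= target


def select_sbr(metric, target, extra_gain_db):
    """Model-estimated SBR plus an additional gain, as the abstract describes."""
    return estimate_sbr(metric, target) + extra_gain_db


if __name__ == "__main__":
    # Target score 0.5 and 3 dB extra gain are placeholder values.
    print(round(select_sbr(toy_intelligibility, 0.5, 3.0), 1))
```

In a real system the toy curve would be replaced by a metric computed from the actual speech and background signals, and the extra gain would depend on the background type, since the abstract reports that backgrounds inducing informational masking need a larger gain.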
Citation
Tang, Y., Fazenda, B., & Cox, T. (2018). Automatic speech-to-background ratio selection to maintain speech intelligibility in broadcasts using an objective intelligibility metric. Applied Sciences, 8(1), 59. https://doi.org/10.3390/app8010059
Journal Article Type | Article |
---|---|
Acceptance Date | Dec 27, 2017 |
Publication Date | 2018-01 |
Deposit Date | Jan 3, 2018 |
Publicly Available Date | Jan 4, 2018 |
Journal | Applied Sciences |
Publisher | MDPI |
Volume | 8 |
Issue | 1 |
Pages | 59 |
DOI | https://doi.org/10.3390/app8010059 |
Publisher URL | http://dx.doi.org/10.3390/app8010059 |
Related Public URLs | http://www.mdpi.com/journal/applsci |
Files
applsci-08-00059.pdf (PDF, 396 Kb)
Licence
http://creativecommons.org/licenses/by/4.0/
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/