From macroscopic to microscopic glimpse-based models of intelligibility prediction
Cooke, M; Tang, Y; Toth, MA
Authors
M Cooke
Y Tang
MA Toth
Abstract
Miller and Licklider's explorations of the intelligibility of temporally interrupted speech, and later studies extending their findings to the spectro-temporal plane, have shown how the twin factors of sparseness and redundancy confer a high degree of robustness on speech in noise. The current contribution addresses two questions. First, to what extent can quantitative estimates of supra-threshold unmasked speech account for average (macroscopic) intelligibility across a range of speech styles and masking conditions? We examine how well glimpse-based objective intelligibility metrics predict listeners' speech recognition scores for natural and synthetic speech in the presence of stationary and fluctuating maskers, and demonstrate reduced correlations for competing sources with an informational masking component. The second question concerns which additional components, beyond speech glimpses, are required to make (microscopic) predictions of actual listener confusions at the level of individual noisy speech tokens. Using corpora of speech-in-noise misperceptions, we show that in many cases the source of listener confusions is the misallocation of information from the masker, suggesting that estimates of supra-threshold unmasked speech alone are insufficient to explain speech intelligibility in noise.
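The abstract does not specify how the glimpse-based metrics are computed. As a rough illustration only, and not the authors' method, the sketch below counts the proportion of spectro-temporal cells in which speech power exceeds masker power by a local SNR threshold. The function name `glimpse_proportion`, the nested-list power representation, and the 3 dB default threshold are all assumptions made for this example.

```python
import math


def glimpse_proportion(speech_power, masker_power, threshold_db=3.0):
    """Return the fraction of time-frequency cells counted as glimpses.

    speech_power and masker_power are equally shaped nested lists of
    non-negative power values (rows = frequency channels, columns = frames).
    A cell is a glimpse when its local SNR exceeds threshold_db.
    The 3 dB default is an assumed value for illustration.
    """
    total = 0
    glimpsed = 0
    for speech_row, masker_row in zip(speech_power, masker_power):
        for s, m in zip(speech_row, masker_row):
            total += 1
            if s <= 0.0:
                continue  # no speech energy, so nothing to glimpse
            if m <= 0.0:
                glimpsed += 1  # speech present and no masker energy
                continue
            local_snr_db = 10.0 * math.log10(s / m)
            if local_snr_db > threshold_db:
                glimpsed += 1
    return glimpsed / total if total else 0.0


# Hypothetical 2-channel, 2-frame example (made-up power values).
speech = [[4.0, 1.0], [10.0, 0.5]]
masker = [[1.0, 1.0], [1.0, 2.0]]
proportion = glimpse_proportion(speech, masker)
```

In this toy input two of the four cells exceed the assumed threshold. A measure of this kind captures only energetic masking, which is consistent with the abstract's point that glimpses alone do not explain confusions arising from misallocated masker information.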
Citation
Cooke, M., Tang, Y., & Toth, M. (in press). From macroscopic to microscopic glimpse-based models of intelligibility prediction. The Journal of the Acoustical Society of America (Online), 139(4), 2187-2187. https://doi.org/10.1121/1.4950509
Journal Article Type | Article |
---|---|
Acceptance Date | Mar 8, 2016 |
Deposit Date | Jun 13, 2016 |
Journal | The Journal of the Acoustical Society of America (JASA) |
Print ISSN | 0001-4966 |
Electronic ISSN | 1520-8524 |
Volume | 139 |
Issue | 4 |
Pages | 2187-2187 |
DOI | https://doi.org/10.1121/1.4950509 |
Publisher URL | http://dx.doi.org/10.1121/1.4950509 |
Additional Information | Projects : INSPIRE: Investigating Speech Processing In Realistic Environments |