Glimpse-based metrics for predicting speech intelligibility in additive noise conditions
Tang, Y; Cooke, M
Authors
Y Tang
M Cooke
Abstract
The glimpsing model of speech perception in noise operates by recognising those speech-dominant spectro-temporal regions, or glimpses, that survive energetic masking; hence, a speech recognition component is an integral part of the model. The current study evaluates whether a simpler family of metrics based solely on quantifying the amount of supra-threshold target speech available after energetic masking can account for subjective intelligibility. The predictive power of glimpse-based metrics is compared for natural, processed and synthetic speech in the presence of stationary and fluctuating maskers. These metrics are raw glimpse proportion, extended glimpse proportion, and two further refinements: one, FMGP, incorporates a component simulating the effect of forward masking; the other, HEGP, selects speech-dominant spectro-temporal regions with above-average energy in the noisy speech. The metrics are compared alongside a state-of-the-art non-glimpsing metric, using three large datasets of listener scores. Both FMGP and HEGP equal or improve upon the predictive power of the raw and extended metrics, with across-masker correlations ranging from 0.81 to 0.92; both metrics equal or exceed the state-of-the-art metric in all conditions. These outcomes suggest that easily computed measures of unmasked, supra-threshold speech can serve as robust proxies for intelligibility across a range of speech styles and additive masking conditions.
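As a rough illustration of the idea behind the raw glimpse proportion, the sketch below counts the fraction of spectro-temporal cells in which the speech level exceeds the masker level by more than a local SNR criterion. It is not the authors' implementation: the function name `glimpse_proportion`, the 3 dB default criterion, and the toy level grids are assumptions made for this example, and the sketch omits any auditory front end, audibility threshold, forward masking (FMGP) or above-average-energy selection (HEGP).

```python
def glimpse_proportion(speech_db, masker_db, local_snr_db=3.0):
    """Fraction of time-frequency cells where speech exceeds the masker.

    speech_db, masker_db: equally shaped lists of rows (frequency channels),
    each row holding levels in dB per time frame. Both are assumed to come
    from the same spectro-temporal analysis of the separate signals.
    local_snr_db: hypothetical local SNR criterion for counting a glimpse.
    """
    if len(speech_db) != len(masker_db):
        raise ValueError("speech and masker must have the same number of channels")

    total = 0
    glimpsed = 0
    for speech_row, masker_row in zip(speech_db, masker_db):
        if len(speech_row) != len(masker_row):
            raise ValueError("speech and masker must have the same number of frames")
        for s, m in zip(speech_row, masker_row):
            total += 1
            # A cell counts as a glimpse when the local SNR exceeds the criterion.
            if s - m > local_snr_db:
                glimpsed += 1

    return glimpsed / total if total else 0.0


# Toy example: 2 channels x 3 frames of made-up levels in dB.
speech = [[60.0, 40.0, 55.0],
          [30.0, 62.0, 50.0]]
masker = [[50.0, 45.0, 53.0],
          [35.0, 50.0, 50.0]]

gp = glimpse_proportion(speech, masker)
print(f"glimpse proportion: {gp:.3f}")
```

In this toy grid only two of the six cells have a local SNR above 3 dB, so the proportion is one third. A metric of this kind needs access to the speech and masker signals separately, which is available in additive noise conditions of the sort the abstract describes.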
Citation
Tang, Y., & Cooke, M. (2016, September). Glimpse-based metrics for predicting speech intelligibility in additive noise conditions. Presented at 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016, San Francisco, USA
Presentation Conference Type | Other |
---|---|
Conference Name | 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 |
Conference Location | San Francisco, USA |
Start Date | Sep 8, 2016 |
End Date | Sep 12, 2016 |
Deposit Date | Sep 9, 2016 |
Publicly Available Date | Apr 29, 2019 |
DOI | https://doi.org/10.21437/Interspeech.2016-14 |
Publisher URL | http://dx.doi.org/10.21437/Interspeech.2016-14 |
Related Public URLs | http://www.isca-speech.org/iscaweb/ |
Additional Information | Event Type: Conference; Funders: European Commission; Projects: The Listening Talker; Grant Number: 256230 |
Files
0014.PDF (171 Kb)