Robust speaker recognition in reverberant condition : toward greater biometric security

Al-Karawi, KA

Authors

KA Al-Karawi

Contributors

FF Li F.F.Li@salford.ac.uk
Supervisor

Abstract

Automatic speaker recognition systems have become an increasingly relevant technology for security applications. The primary challenge for automatic speaker recognition is dealing with the variability of the environments and channels from which the speech is obtained. Previous work has achieved good results for clean, high-quality speech when training and test acoustic conditions match. However, under mismatched conditions and in reverberant environments, both common in the real world, system performance degrades significantly. The main aim of this study is to improve the robustness of speaker recognition systems for real-world applications in reverberant conditions by developing methods that reduce the detrimental effects of reverberation on the single-microphone speech signal.
The collection of suitable speech data sets is crucial for testing performance during the development of speaker recognition techniques. Therefore, a data set of anechoic speech recordings was generated and used to evaluate the methods proposed in this thesis. Furthermore, a typical speaker recognition system was implemented and evaluated based on the current state-of-the-art technique, Gaussian Mixture Models, with two standard features. The effects of reverberation time and of the distance from the source to the receiver on system performance were also examined, and the results confirm that both parameters can affect system accuracy.
A maximum likelihood algorithm is used to blindly estimate the reverberation time from speech signals submitted for verification. The estimated values are used to choose a matched acoustic impulse response for inclusion in the retraining or fine-tuning of the pattern recognition model.
To improve performance further, the autocorrelation function is used to estimate the early reflections of the submitted signal. The estimated early reflections are convolved with the anechoic signal, and the result is used to train the pattern recognition model. Furthermore, both the early-to-late ratio and the reverberation time (RT) are identified for the submitted sample and used to determine a matched channel for training on the fly to improve system performance.
The principal findings are that reverberation time, early reflections and the early-to-late ratio can be estimated and then used with training-on-the-fly methods to improve speaker verification performance. The improvement is demonstrated by comparing the performance of speaker recognition using conventional methods with that of the proposed re-training method.
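The matched-channel idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes the reverberation time has already been estimated by some blind method, and it uses synthetic, exponentially decaying noise as a hypothetical stand-in for measured impulse responses. All function names, the 50 ms early/late split and the bank of RT values are assumptions made for illustration.

```python
import numpy as np

def synthetic_rir(rt60, fs=8000, seed=0):
    """Hypothetical stand-in for a measured room impulse response:
    white noise whose amplitude falls by 60 dB after rt60 seconds."""
    rng = np.random.default_rng(seed)
    n = int(fs * rt60)
    t = np.arange(n) / fs
    # 60 dB amplitude decay means a factor of 10**-3 at t = rt60.
    envelope = np.exp(-3.0 * np.log(10.0) * t / rt60)
    rir = rng.standard_normal(n) * envelope
    return rir / np.max(np.abs(rir))

def pick_matched_rir(estimated_rt60, rir_bank):
    """Return (rt60, rir) for the bank entry closest to the estimate."""
    nearest = min(rir_bank, key=lambda rt: abs(rt - estimated_rt60))
    return nearest, rir_bank[nearest]

def reverberate(anechoic, rir):
    """Convolve anechoic speech with an impulse response, keep the
    original length, and peak-normalise the result."""
    wet = np.convolve(anechoic, rir)[: len(anechoic)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet

def early_to_late_ratio_db(rir, fs=8000, split_ms=50.0):
    """Energy before the split over energy after it, in dB.
    The 50 ms split is an assumed convention, not taken from the thesis."""
    split = int(fs * split_ms / 1000.0)
    early = np.sum(rir[:split] ** 2)
    late = np.sum(rir[split:] ** 2)
    return 10.0 * np.log10(early / late)

# Assumed bank of candidate channels, keyed by reverberation time in seconds.
bank = {rt: synthetic_rir(rt) for rt in (0.3, 0.6, 0.9)}

# Suppose a blind estimator returned 0.52 s for the submitted sample.
matched_rt, matched_rir = pick_matched_rir(0.52, bank)

# Placeholder for anechoic training speech (0.5 s of noise at 8 kHz).
anechoic = np.random.default_rng(1).standard_normal(4000)
training_signal = reverberate(anechoic, matched_rir)
```

In a full system, `training_signal` would replace the anechoic recording when the speaker model is retrained, so that the training and test conditions are closer to each other.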

Citation

Al-Karawi, K. (in press). Robust speaker recognition in reverberant condition : toward greater biometric security. (Thesis). University of Salford

Thesis Type	Thesis
Acceptance Date	Jun 8, 2018
Deposit Date	Sep 21, 2018
Publicly Available Date	Sep 21, 2018
Additional Information	Funders : The Ministry of Higher Education, Iraq
Award Date	May 29, 2018