Research Repository

Improving the measurement and acoustic performance of transparent face masks and shields (2022)
Journal Article
Cox, T. J., Dodgson, G., Harris, L., Perugia, E., Stone, M. A., & Walsh, M. (2022). Improving the measurement and acoustic performance of transparent face masks and shields. The Journal of the Acoustical Society of America (Online), 151(5), 2931-2944. https://doi.org/10.1121/10.0010384

Opaque face masks harm communication by preventing speech-reading (lip-reading) and attenuating high-frequency sound. Although transparent masks and shields (visors) with clear plastic inserts allow speech-reading, they usually create more sound atte...

Dataset of British English speech recordings for psychoacoustics and speech processing research : the Clarity Speech Corpus (2022)
Journal Article
Graetzer, S., Akeroyd, M., Barker, J., Cox, T., Culling, J., Naylor, G., …Muñoz, R. (2022). Dataset of British English speech recordings for psychoacoustics and speech processing research : the Clarity Speech Corpus. Data in Brief, 41, 107951. https://doi.org/10.1016/j.dib.2022.107951

This paper presents the Clarity Speech Corpus, a publicly available, forty speaker British English speech dataset. The corpus was created for the purpose of running listening tests to gauge speech intelligibility and quality in the Clarity Project, w...

Clarity-2021 challenges : machine learning challenges for advancing hearing aid processing (2021)
Journal Article
Graetzer, S., Barker, J., Cox, T., Akeroyd, M., Culling, J., Naylor, G., …Viveros Munoz, R. (2021). Clarity-2021 challenges : machine learning challenges for advancing hearing aid processing. https://doi.org/10.21437/Interspeech.2021-1574

In recent years, rapid advances in speech technology have been made possible by machine learning challenges such as CHiME, REVERB, Blizzard, and Hurricane. In the Clarity project, the machine learning approach is applied to the problem of hearing aid...

Using scale modelling to assess the prehistoric acoustics of Stonehenge (2020)
Journal Article
Cox, T., Fazenda, B., & Greaney, S. (2020). Using scale modelling to assess the prehistoric acoustics of Stonehenge. Journal of Archaeological Science, 122, 105218. https://doi.org/10.1016/j.jas.2020.105218

With social rituals usually involving sound, an archaeological understanding of a site requires the acoustics to be assessed. This paper demonstrates how this can be done with acoustic scale models. Scale modelling is an established method in archite...

Fast speech intelligibility estimation using a neural network trained via distillation (2020)
Presentation / Conference
Cox, T., Bailey, W., & Tang, Y. (2020, January). Fast speech intelligibility estimation using a neural network trained via distillation. Poster presented at 12th Speech in Noise Workshop, Toulouse, France

Objective measures of speech intelligibility have many uses, including the evaluation of degradation during transmission and the development of processing algorithms. One intrusive approach is to use a method based on the audibility of speech glimpse...

Personality and cognitive factors in the assessment of multimodal stimuli in immersive virtual environments (2019)
Thesis
Bailey, J. Personality and cognitive factors in the assessment of multimodal stimuli in immersive virtual environments. (Thesis). University of Salford

Literature in the study of human response to immersive virtual reality systems often deals with the phenomenon of presence. It can be shown that audio and imagery with spatial information can interact to affect presence in users of immersive virtual...

Pupil dilation reveals changes in listening effort due to energetic and informational masking (2019)
Presentation / Conference
Woodcock, J., Fazenda, B., Cox, T., & Davies, W. (2019, September). Pupil dilation reveals changes in listening effort due to energetic and informational masking. Presented at ICA 2019, Aachen, Germany

Pupil dilation has previously been shown to be a useful involuntary marker of listening effort. An inverse relationship between pupil diameter and signal to noise ratio has been shown when speech is energetically masked by noise. The work reported he...

Influence of visual stimuli on perceptual attributes of spatial audio (2019)
Journal Article
Woodcock, J., Davies, W., & Cox, T. (2019). Influence of visual stimuli on perceptual attributes of spatial audio. Journal of the Audio Engineering Society, 67(7/8), 557-567. https://doi.org/10.17743/jaes.2019.0019

Reproduced audio is often accompanied with visuals (i.e. television, virtual reality, gaming, and cinema). However, the audio technology for these systems is often researched and evaluated in isolation from the visual component. Previous research ind...

Generalisation in environmental sound classification : the ‘making sense of sounds’ data set and challenge (2019)
Presentation / Conference
Kroos, C., Bones, O., Cao, Y., Harris, L., Jackson, P., Davies, W., …Plumbley, M. (2019, May). Generalisation in environmental sound classification : the ‘making sense of sounds’ data set and challenge. Presented at 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2019), Brighton, UK

Humans are able to identify a large number of environmental sounds and categorise them according to high-level semantic categories, e.g. urban sounds or music. They are also capable of generalising from past experience to new sounds when applying the...

Background adaptation for improved listening experience in broadcasting (2019)
Presentation / Conference
Tang, Y., Cox, T., Fazenda, B., Liu, Q., & Wang, W. (2019, May). Background adaptation for improved listening experience in broadcasting. Presented at 44th International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, Brighton, UK

The intelligibility of speech in noise can be improved by modifying the speech. But with object-based audio, there is the possibility of altering the background sound while leaving the speech unaltered. This may prove a less intrusive approach, affor...

The effects of classroom noise on the reading comprehension of adolescents (2019)
Journal Article
Connolly, D., Dockrell, J., Shield, B., Conetta, R., Mydlarz, C., & Cox, T. (2019). The effects of classroom noise on the reading comprehension of adolescents. The Journal of the Acoustical Society of America (Online), 145(1), 372-381. https://doi.org/10.1121/1.5087126

An investigation has been carried out to examine the impact of different levels of classroom noise on adolescents’ performance on reading and vocabulary-learning tasks. A total of 976 English high school pupils (564 aged 11 to 13 years and 412 aged 1...

Perceptual audio evaluation of media device orchestration using the multi-stimulus ideal profile method (2018)
Presentation / Conference
Wilson, A., Cox, T., Zacharov, N., & Pike, C. (2018, October). Perceptual audio evaluation of media device orchestration using the multi-stimulus ideal profile method. Presented at Audio Engineering Society 145th Convention, New York, USA

The evaluation of object-based audio reproduction methods in a real-world context remains a challenge as it is difficult to separate the effects of the reproduction system from the effects of the audio mix rendered for that system. This is often comp...

Sound categories : category formation and evidence-based taxonomies (2018)
Journal Article
Bones, O., Cox, T., & Davies, W. (2018). Sound categories : category formation and evidence-based taxonomies. Frontiers in Psychology, 9, #1277. https://doi.org/10.3389/fpsyg.2018.01277

Five evidence-based taxonomies of everyday sounds frequently reported in the soundscape literature have been generated. An online sorting and category-labelling method that elicits rather than prescribes descriptive words was used. A total of N=242 p...

Qualitative evaluation of media device orchestration for immersive spatial audio reproduction (2018)
Journal Article
Francombe, J., Woodcock, J., Hughes, R., Mason, R., Franck, A., Pike, C., …Hilton, A. (2018). Qualitative evaluation of media device orchestration for immersive spatial audio reproduction. Journal of the Audio Engineering Society, 66(6), 414-429. https://doi.org/10.17743/jaes.2018.0027

The challenge of installing and setting up dedicated spatial audio systems can make it difficult to deliver immersive listening experiences to the general public. However, the proliferation of smart mobile devices and the rise of the Internet of Thin...

Speech-to-screen : spatial separation of dialogue from noise towards improved speech intelligibility for the small screen (2018)
Presentation / Conference
Demonte, P., Tang, Y., Hughes, R., Cox, T., Fazenda, B., & Shirley, B. (2018, May). Speech-to-screen : spatial separation of dialogue from noise towards improved speech intelligibility for the small screen. Presented at 144th International Pro Audio Convention (AES Milan 2018), Milan, Italy

Can externalizing dialogue when in the presence of stereo background noise improve speech intelligibility? This has been investigated for audio over headphones using head-tracking in order to explore potential future developments for small-screen dev...

Elicitation of expert knowledge to inform object-based audio rendering to different systems (2018)
Journal Article
… (2018). Elicitation of expert knowledge to inform object-based audio rendering to different systems. Journal of the Audio Engineering Society, 66(1/2), 44-59. https://doi.org/10.17743/jaes.2018.0001

Object-based audio presents the opportunity to optimise audio reproduction for different listening scenarios. Vector base amplitude panning (VBAP) is typically used to render object-based scenes. Optimizing this process based on knowledge of the perc...

Automatic speech-to-background ratio selection to maintain speech intelligibility in broadcasts using an objective intelligibility metric (2018)
Journal Article
Tang, Y., Fazenda, B., & Cox, T. (2018). Automatic speech-to-background ratio selection to maintain speech intelligibility in broadcasts using an objective intelligibility metric. Applied Sciences, 8(1), 59. https://doi.org/10.3390/app8010059

While mixing, sound producers and audio professionals empirically set the speech-to-background ratio (SBR) based on rules of thumb and their own perception of sounds. There is no guarantee that the speech content will be intelligible for the general...

A non-intrusive method for estimating binaural speech intelligibility from noise-corrupted signals captured by a pair of microphones (2017)
Journal Article
Tang, Y., Liu, Q., Wang, W., & Cox, T. (2018). A non-intrusive method for estimating binaural speech intelligibility from noise-corrupted signals captured by a pair of microphones. Speech Communication, 96, 116-128. https://doi.org/10.1016/j.specom.2017.12.005

A non-intrusive method is introduced to predict binaural speech intelligibility in noise directly from signals captured using a pair of microphones. The approach combines signal processing techniques in blind source separation and localisation, with...