Cloud-based AI for automatic audio production for personalized immersive XR experiences

Oldfield, RG; Walley, MSS; Shirley, BG; Williams, DL

Authors

RG Oldfield

MSS Walley

BG Shirley

DL Williams



Abstract

In this article, we focus on the machine-learning approach developed for automatic audio source recognition and mixing within 5G Edge-XR, a collaborative project funded by the U.K. Government Department for Digital, Culture, Media and Sport (DCMS). Leveraging graphics processing unit (GPU) acceleration, we deployed innovative algorithms in the cloud so that content can be automatically mixed on the fly for a personalized, immersive, and interactive audience experience. We describe the algorithms involved, the system architecture, how it has been implemented for immersive live boxing, and how we are using it to enhance a live in-stadium experience.
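The article itself details the recognition and mixing algorithms; as a rough illustration of the kind of pipeline the abstract describes, the sketch below labels each incoming audio feed and then builds a personalized mix from per-listener gain preferences. The class names, the threshold-based stand-in classifier, and the personalized_mix helper are illustrative assumptions, not the system described in the article.

```python
# Hypothetical sketch of automatic source recognition + personalized mixing.
# The label set, thresholds, and gains are assumptions for illustration only.
import numpy as np

SOURCE_CLASSES = ("commentary", "crowd", "impact")  # assumed label set


def classify_source(frame: np.ndarray) -> str:
    """Stand-in for the ML source-recognition step.

    Here a trivial energy-based rule; in the deployed system this role is
    played by the cloud-hosted, GPU-accelerated model the abstract mentions.
    """
    rms = float(np.sqrt(np.mean(frame ** 2)))
    if rms > 0.5:
        return "impact"
    return "commentary" if rms > 0.1 else "crowd"


def personalized_mix(frames: list[np.ndarray], prefs: dict[str, float]) -> np.ndarray:
    """Apply per-class gains chosen by the listener and sum into one mix."""
    mix = np.zeros_like(frames[0])
    for frame in frames:
        label = classify_source(frame)
        mix += prefs.get(label, 1.0) * frame
    return mix


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feeds = [
        0.05 * rng.standard_normal(1024),  # quiet feed -> classified "crowd"
        0.30 * rng.standard_normal(1024),  # medium feed -> "commentary"
        0.80 * rng.standard_normal(1024),  # loud feed -> "impact"
    ]
    listener_prefs = {"commentary": 1.0, "crowd": 0.4, "impact": 1.2}
    out = personalized_mix(feeds, listener_prefs)
    print("mixed frame peak:", float(np.max(np.abs(out))))
```

Each listener can supply a different prefs dictionary, so the same classified feeds yield a different mix per user, which is the personalization idea the abstract outlines.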

Citation

Oldfield, R., Walley, M., Shirley, B., & Williams, D. (2022). Cloud-based AI for automatic audio production for personalized immersive XR experiences. SMPTE Motion Imaging Journal, 131(7), 6-16. https://doi.org/10.5594/JMI.2022.3184849

Journal Article Type Article
Acceptance Date Jun 8, 2022
Publication Date Aug 5, 2022
Deposit Date Sep 28, 2022
Journal SMPTE Motion Imaging Journal
Print ISSN 1545-0279
Electronic ISSN 2160-2492
Volume 131
Issue 7
Pages 6-16
DOI https://doi.org/10.5594/JMI.2022.3184849
Publisher URL https://doi.org/10.5594/JMI.2022.3184849
Additional Information Corporate Creators: Salsa Sound Ltd, BT Applied Research
Projects: 5G Edge-XR