P Coleman
An audio-visual system for object-based audio : from recording to listening
Coleman, P; Franck, A; Francombe, J; Liu, Q; de Campos, T; Hughes, RJ; Menzies, D; Galvez, MFS; Tang, Y; Woodcock, JS; Jackson, PJB; Melchior, F; Pike, C; Fazi, FM; Cox, TJ; Hilton, A
Authors
A Franck
J Francombe
Q Liu
T de Campos
RJ Hughes
D Menzies
MFS Galvez
Y Tang
JS Woodcock
PJB Jackson
F Melchior
C Pike
FM Fazi
Prof Trevor Cox (T.J.Cox@salford.ac.uk)
A Hilton
Abstract
Object-based audio is an emerging representation for audio content, where content is represented in a reproduction format-agnostic way and, thus, produced once for consumption on many different kinds of devices. This affords new opportunities for immersive, personalized, and interactive listening experiences. This paper introduces an end-to-end object-based spatial audio pipeline, from sound recording to listening. A high-level system architecture is proposed, which includes novel audiovisual interfaces to support object-based capture and listener-tracked rendering, and incorporates a proposed component for objectification, that is, recording content directly into an object-based form. Text-based and extensible metadata enable communication between the system components. An open architecture for object rendering is also proposed. The system’s capabilities are evaluated in two parts. First, listener-tracked reproduction of metadata automatically estimated from two moving talkers is evaluated using an objective binaural localization model. Second, object-based scene capture with audio extracted using blind source separation (to remix between two talkers) and beamforming (to remix a recording of a jazz group) is evaluated
Citation
Coleman, P., Franck, A., Francombe, J., Liu, Q., de Campos, T., Hughes, R., …Hilton, A. (2018). An audio-visual system for object-based audio : from recording to listening. IEEE Transactions on Multimedia, 20(8), 1919-1931. https://doi.org/10.1109/TMM.2018.2794780
Field | Value |
---|---|
Journal Article Type | Article |
Online Publication Date | Jan 17, 2018 |
Publication Date | Aug 1, 2018 |
Deposit Date | Dec 11, 2019 |
Publicly Available Date | Dec 11, 2019 |
Journal | IEEE Transactions on Multimedia |
Print ISSN | 1520-9210 |
Electronic ISSN | 1941-0077 |
Publisher | Institute of Electrical and Electronics Engineers |
Volume | 20 |
Issue | 8 |
Pages | 1919-1931 |
DOI | https://doi.org/10.1109/TMM.2018.2794780 |
Publisher URL | https://doi.org/10.1109/TMM.2018.2794780 |
Related Public URLs | https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6046 |
Files
Coleman_2018_IEEE_An_audio-visual_system_for_object-based_audio_from_recording_to_listening.pdf
(1.6 Mb)
PDF
Licence
http://creativecommons.org/licenses/by/3.0/
Publisher Licence URL
http://creativecommons.org/licenses/by/3.0/