
Challenges in Immersive Audio Technology

Lead Research Organisation: King's College London
Department Name: Engineering

Abstract

Immersive technologies will transform not only how we communicate and experience entertainment, but also our experience of the physical world, from shops to museums, cars to classrooms. This transformation has been driven primarily by unprecedented progress in visual technologies, which can transport users to an alternate visual reality. In the audio domain, however, long-standing fundamental challenges must be overcome to enable striking immersive experiences in which a group of listeners can simply walk into a scene and feel transported to an alternate reality, enjoying a seamless shared experience without the need for headphones, head-tracking, personalisation or calibration.

The first key challenge is the delivery of immersive audio experiences to multiple listeners. Recent advances in audio technology are beginning to succeed in generating high-quality immersive audio experiences. However, these are restricted in practice to individual listeners, with appropriate signals presented either via headphones or via systems based on a modest number of loudspeakers using cross-talk cancellation or beamforming. There remains a fundamental challenge in the technologically efficient delivery of "3D sound" to multiple listeners, whether in small numbers (2-5) in a home environment, in museums, galleries and other public spaces (5-20), or in cinema and theatre auditoria (20-100). In principle, shared auditory experiences can be generated using physics-based methods such as wavefield synthesis or higher-order ambisonics, but a sweet spot of even modest size requires a prohibitive number of channels. CIAT aims to transform the state of the art by developing a principled, scalable and reconfigurable framework for capturing and reproducing only perceptually relevant information, thus leading to a step advance in the quality of immersive audio experiences achievable by practically viable systems.
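The channel-count problem can be made concrete with the standard higher-order-ambisonics rule of thumb: accurate reproduction over a sweet spot of radius r up to frequency f needs order N ≳ kr (k = 2πf/c), and order N needs (N+1)² channels. A minimal sketch of that arithmetic, illustrative only and not part of the project's methods:

```python
import math

def hoa_requirements(radius_m: float, freq_hz: float, c: float = 343.0) -> tuple[int, int]:
    """Rule-of-thumb ambisonic order and channel count for a sweet spot.

    Uses the common heuristic N >= k*r (k = 2*pi*f/c), with (N+1)^2
    spherical-harmonic channels needed for order N.
    """
    k = 2 * math.pi * freq_hz / c      # wavenumber at frequency f
    order = math.ceil(k * radius_m)    # minimum ambisonic order
    channels = (order + 1) ** 2        # channels required for that order
    return order, channels

# A 1 m radius listening area, accurate up to 8 kHz:
order, channels = hoa_requirements(radius_m=1.0, freq_hz=8000.0)
print(order, channels)  # → 147 21904
```

Even this single-metre sweet spot implies tens of thousands of channels at audio bandwidth, which is why perceptually motivated approaches are attractive.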

The second key challenge is the real-time computation of environment acoustics needed to transport listeners to an alternate reality, allowing them to interact with the environment and the sound sources in it. This is pertinent to applications where immersive audio content is synthesised rather than recorded, and to object-based audio in general. The sound field of an acoustic event consists of the direct wavefront, followed by early and higher-order reflections. A convincing experience of being transported to the environment where the event takes place requires the rendering of these reflections, which cannot all be computed in real time. In applications where the sense of realism is critical, e.g. extended reality (XR) and to some extent gaming, impulse responses of the environment are typically computed only at several locations, with preset limits on the number of reflections and directions of arrival, and then convolved with source sounds to achieve what is referred to as high-quality reverberation. Even so, the computation of impulse responses and the convolution may require GPU implementation and careful hands-on balancing between quality and complexity, and between CPU and GPU computation. CIAT aims to deliver a paradigm shift in environment modelling that will enable numerically efficient, seamless, high-quality environment simulation in real time.
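The convolution step the paragraph describes is standard: the dry source signal is filtered with the environment's impulse response, usually via the FFT for efficiency. A minimal self-contained sketch of that step (the impulse response here is a toy two-tap "room", not a simulated one, and none of this reflects the project's own implementation):

```python
import numpy as np

def convolution_reverb(dry: np.ndarray, impulse_response: np.ndarray) -> np.ndarray:
    """Apply an environment impulse response to a dry signal via FFT convolution."""
    n = len(dry) + len(impulse_response) - 1   # full linear-convolution length
    nfft = 1 << (n - 1).bit_length()           # next power of two for the FFT
    spectrum = np.fft.rfft(dry, nfft) * np.fft.rfft(impulse_response, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]    # truncate zero-padding

# Toy example: a click through a direct path plus one echo at ~5 ms.
fs = 48_000
dry = np.zeros(fs // 100)                      # 10 ms buffer
dry[0] = 1.0                                   # unit click
ir = np.zeros(fs // 200)                       # 5 ms impulse response
ir[0], ir[-1] = 1.0, 0.5                       # direct sound + echo at half gain
wet = convolution_reverb(dry, ir)
```

In real renderers this runs block-by-block (partitioned convolution) so the latency stays bounded, which is where the CPU/GPU balancing mentioned above becomes delicate.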

By addressing these challenges, CIAT will enable the creation and delivery of shared, interactive immersive audio experiences for emerging XR applications, whilst making a step advance in the quality of immersive audio in traditional media. In particular, efficient real-time synthesis of high-quality environment acoustics is essential for both XR and object-based audio in general, including streaming and broadcasting. Delivery of 3D soundscapes to multiple listeners is a major unresolved problem in traditional applications too, including broadcasting, cinema, music events, and audio-visual installations.

Publications

 
Title: 12 Hours at Rainy Days Festival
Description: A marathon immersive sound performance for voice and electronics.
Type Of Art: Performance (Music, Dance, Drama, etc)
Year Produced: 2024
Impact: The main impact is on creative practice in music composition, enabled by the technologies developed on the project.
URL: https://whatsnew.composersedition.com/12hours-at-rainy-days-festival-interview/
 
Title: ReepsOne at DAFx24
Description: Immersive interactive live performance.
Type Of Art: Performance (Music, Dance, Drama, etc)
Year Produced: 2024
Impact: The most notable impact is in the domain of artistic creativity enabled by the technologies developed on the project.
URL: https://dafx24.surrey.ac.uk/social-events/
 
Description: King's College London Impact Acceleration Award
Amount: £37,500 (GBP)
Organisation: King's College London
Sector: Academic/University
Country: United Kingdom
Start: 03/2025
End: 10/2025
 
Description: Institute of Sound Recording, University of Surrey
Organisation: University of Surrey
Country: United Kingdom
Sector: Academic/University
PI Contribution: Expertise and intellectual input.
Collaborator Contribution: Expertise and intellectual input.
Impact: Joint publications, a grant proposal, and further development of the audio technology from the relevant EPSRC project and its deployment in art projects and installations.
Start Year: 2016
 
Description: METU
Organisation: Middle East Technical University
Department: Institute of Marine Sciences
Country: Turkey
Sector: Academic/University
PI Contribution: Expertise and intellectual input.
Collaborator Contribution: Expertise and intellectual input.
Impact: Joint publications; further development of the audio technology from the relevant EPSRC project and its deployment in art projects and installations.
Start Year: 2012
 
Description: National Gallery X
Organisation: National Gallery, London
Country: United Kingdom
Sector: Charity/Non Profit
PI Contribution: My team has provided immersive audio technology and technical support.
Collaborator Contribution: The National Gallery has provided equipment and funded residencies for sound artists to develop and test creative responses to a gallery interpretative challenge.
Impact: Rain, Steam and Speed, an audio/visual performance; a multi-disciplinary collaboration involving music, visual arts, and audio-visual technologies.
Start Year: 2019
 
Description: Stanford
Organisation: Stanford University
Country: United States
Sector: Academic/University
PI Contribution: Collaborative research.
Collaborator Contribution: Collaborative research.
Impact: A joint tutorial on multichannel surround systems, to be presented at ICASSP 2015. A joint paper to be submitted to the IEEE/ACM Transactions on Audio, Speech, and Language Processing.
Start Year: 2013
 
Description: ReepsOne performance at DAFx24
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme?: No
Geographic Reach: International
Primary Audience: Professional Practitioners
Results and Impact: Musician ReepsOne created an immersive audio performance using the technology being developed on the project. The performance took place at DAFx24, the 2024 International Conference on Digital Audio Effects.
Year(s) Of Engagement Activity: 2024