AI-driven Soundscapes Design for Immersive Environments

Lead Research Organisation: University of York
Department Name: Electronics

Abstract

Brief description of the context of the research including potential impact:

Immersive experiences are a significant growth area for many sectors, including the creative industries, engineering design and healthcare. Developing immersive audio/visual content requires new skills in the capture and creation of assets that support user interaction and navigation of virtual spaces, and this is both a challenge and an opportunity for those working with established methods in the creative screen industries. Content development workflows need to be redesigned and made more efficient to meet the demands of 360-degree immersion and interaction - a step change from traditional 2-D screen-based approaches in film, TV, games and broadcast.

Machine learning - or machine listening - has been applied to sound and music production and to the automatic mixing and processing of audio signals in a number of research projects, but little work has yet explored sound design, or immersive environments. This project will extend these methodologies to more creative contexts suitable for radio, TV and film soundtracks, with a view to automating part of the sound design process for immersive, interactive and object-based audio content. Perceptual testing will be used to assess the ecological validity of the results, which will then be used to train an AI-based approach to soundtrack design capable of dealing with the significant additional demands of open, non-linear, interactive and immersive sound worlds.

Aims and Objectives:

We will develop new audio production workflows for immersive and interactive object-based sound design scenarios.

- Review state-of-the-art techniques for soundscape modelling, creation and perception, considering both the academic perspective and creative practice in immersive media.
- Develop novel techniques for soundscape generation/synthesis, incorporating machine learning for automation and allowing creative control of algorithms.
- Identify and apply objective measures to evaluate the quality of the resulting soundscape generation/synthesis.
- Use perceptual evaluation to assess synthesis techniques, and user experience research to assess production tools built with these techniques.
- Consider the results in the context of production case studies.

The research methodology, including new knowledge or techniques in engineering and physical sciences that will be investigated:

The project will combine research and development, practice-based approaches, and user studies with colleagues from the BBC. It will draw together knowledge of sound, music and immersive audio technology, DSP, and AI/machine-learning programming as applied to sound design aesthetics, together with the design and perception of immersive experiences. The research will be underpinned by user-experience/HCI design criteria and will make use of perceptual studies through audio and audio-visual subjective testing with panels of expert and novice subjects.

Alignment to EPSRC's Strategies and Research Areas:

This project fits the EPSRC ICT and Digital Economy themes working towards a more Connected Nation and in particular the Content Creation and Consumption priority area. There are also significant linkages with AI Technologies, DSP, HCI, Music and Acoustic Technology EPSRC Research Areas.

Companies or collaborators involved:

The project is part of the ongoing relationship between the University of York and BBC R&D, including the Audio Research Partnership and the EPSRC/AHRC/InnovateUK-funded Digital Creativity Labs. The PhD student will be supervised by Professor Damian Murphy and study in the Department of Electronic Engineering AudioLab at the University of York, spending short periods of time with BBC R&D leading to a full one-year placement in the third year of this research project.

Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/S513945/1                                      01/10/2018   30/09/2023
2109164             Studentship    EP/S513945/1   01/10/2018   31/05/2023   Daniel Turner
 
Description

1) A novel dataset of spatial impulse responses was generated to aid the study of extracting higher-dimensional data from lower-dimensional representations of sound fields. It comes complete with tools for using the IRs to synthesise sound scenes with a variety of stereo microphone techniques and with Ambisonics up to 4th order.
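To illustrate the kind of Ambisonic scene synthesis the dataset's tools support, a mono source can be encoded into first-order Ambisonics, the lowest order of the up-to-4th-order material described above. This is a minimal sketch assuming the common ACN channel ordering with SN3D normalisation; the function name and test signal are illustrative and not part of the project's actual toolset:

```python
import numpy as np

def encode_foa(mono, azimuth_deg, elevation_deg=0.0):
    """Encode a mono signal into first-order Ambisonics (ACN order W, Y, Z, X,
    SN3D normalisation). Gains are the real spherical-harmonic encoding
    coefficients for a plane-wave source at the given direction."""
    az = np.radians(azimuth_deg)
    el = np.radians(elevation_deg)
    w = np.asarray(mono, dtype=float)      # omnidirectional component
    y = w * np.sin(az) * np.cos(el)        # left/right figure-of-eight
    z = w * np.sin(el)                     # up/down figure-of-eight
    x = w * np.cos(az) * np.cos(el)        # front/back figure-of-eight
    return np.stack([w, y, z, x])

# illustrative source: a 440 Hz tone positioned hard left (90 degrees)
fs = 48000
sig = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
bformat = encode_foa(sig, azimuth_deg=90.0)   # shape: (4, 48000)
```

Higher orders add further spherical-harmonic channels following the same pattern, which is why 4th-order material (25 channels) carries much finer spatial detail than the stereo representations studied here.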

2) An investigation into the challenges faced by industry professionals when producing spatial audio for immersive experiences produced a range of novel findings, including difficulties with auditory distance emulation, the lack of spatial audio sound effects libraries, and problems integrating legacy stereo content into spatial audio productions.

3) Development of an early-stage methodology to extract audio panning data from a visual scene using vision-based object detection, and to use object classification to suggest candidate sound effects from a chosen repository. This proof of concept dealt with 2-D video; however, recent advances in computer vision could see it applied to 360-degree video.
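The panning-extraction step described in (3) can be sketched simply: the horizontal centre of a detected object's bounding box is mapped to a pan azimuth across the camera's field of view. This is a minimal sketch under stated assumptions (a fixed field of view and a linear pixel-to-angle mapping); the function name and the example detection are hypothetical, not taken from the project:

```python
def bbox_to_azimuth(x_center_px, frame_width_px, fov_deg=90.0):
    """Map the horizontal centre of a detected bounding box to a pan azimuth.

    0 degrees is straight ahead; positive values pan to the right. A linear
    pixel-to-angle mapping is assumed for simplicity (a rectilinear camera
    would strictly need a tan() correction)."""
    offset = (x_center_px / frame_width_px) - 0.5   # -0.5 .. +0.5 across frame
    return offset * fov_deg

# hypothetical detection: an object centred at pixel 1440 in a 1920-px frame
azimuth = bbox_to_azimuth(1440, 1920)   # 22.5 degrees right of centre
```

The resulting azimuth could then drive an object-based panner, while the detector's class label (e.g. "car") indexes a sound effects repository for candidate assets, as the methodology proposes.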
Exploitation Route

Outcome (1) could be used by both academia and industry to synthesise sound scenes for evaluating spatial analysis algorithms and for training machine learning algorithms on a range of audio-related tasks.

Outcome (2) highlighted multiple areas for technology development and research that could be taken forward by academia and industry, in particular by the creative industries and by technology companies developing creative production tools.

Outcome (3) could be extended by either academia or industry to work with 360-degree video, forming the basis of new production tools for immersive audiovisual experiences. Candidate sound effect suggestion could also be taken forward and enhanced through machine learning, especially given the surge in language model development.
Sectors

Creative Economy, Digital/Communication/Information Technologies (including Software)