S3A: Future Spatial Audio for an Immersive Listener Experience at Home
Lead Research Organisation:
University of Surrey
Department Name: Vision Speech and Signal Proc CVSSP
Abstract
3D sound can offer listeners the experience of "being there" at a live event, such as the Proms or Olympic 100m, but
currently requires highly controlled listening spaces and loudspeaker setups. The goal of S3A is to realise practical
3D audio for the general public to enable immersive experiences at home or on the move.
Virtually the whole of the UK population consumes audio. S3A aims to unlock the creative potential of 3D sound and deliver to listeners a step change in immersive experiences. This requires a radically new listener-centred approach to audio, enabling 3D sound production to adapt dynamically to the listener's environment. Achieving immersive audio experiences in uncontrolled living spaces presents a significant research challenge: it requires major advances in our understanding of the perception of spatial audio, together with new representations of audio and the signal processing that allows content creation and perceptually accurate reproduction.

Existing audio production formats (stereo, 5.1) and those proposed for future cinema spatial audio (24 or 128 channels) are channel-based, requiring specific controlled loudspeaker arrangements that are simply not practical for the majority of home listeners. S3A will pioneer a novel object-based methodology for audio signal processing that allows flexible production and reproduction in real spaces. Reproduction will adapt to the loudspeaker configuration, room acoustics and listener locations. The fields of audio and visual 3D scene understanding will be brought together to identify and model audio-visual objects in complex real scenes. Audio-visual objects are sound sources or events with known spatial properties of shape and location over time, e.g. a football being kicked, a musical instrument being played or the crowd chanting at a football match.

Object-based representation will transform audio production from existing channel-based signal mixing (stereo, 5.1, 22.2) to spatial control of isolated sound sources and events. This will realise the creative potential of 3D sound, enabling intelligent user-centred content production, transmission and reproduction of 3D audio content in platform-independent formats. Object-based audio will allow flexible delivery (broadcast, IP and mobile) and adaptive reproduction of 3D sound to existing and new digital devices.
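To make the object-based approach concrete, here is a minimal sketch, not S3A's actual renderer: all names and the 2D pairwise-panning (VBAP) formulation are illustrative assumptions. A single audio object's position metadata is rendered at playback time onto whatever loudspeakers happen to be present.

```python
# Minimal sketch of the object-based idea: the object carries position
# metadata, and per-loudspeaker gains are derived at reproduction time.
import numpy as np

def pan_object_2d(azimuth_deg, speaker_azimuths_deg):
    """Pairwise amplitude panning (2D VBAP) of one audio object onto an
    arbitrary ring of loudspeakers. Returns one gain per loudspeaker."""
    az = np.radians(azimuth_deg)
    spk = np.radians(np.asarray(speaker_azimuths_deg, dtype=float))
    order = np.argsort(spk)
    spk_sorted = spk[order]
    # Find the loudspeaker pair that brackets the object direction.
    idx = np.searchsorted(spk_sorted, az) % len(spk_sorted)
    l, r = (idx - 1) % len(spk_sorted), idx
    # Solve [u_l u_r] g = u_obj for the two active gains (2D unit vectors).
    base = np.column_stack([[np.cos(spk_sorted[l]), np.sin(spk_sorted[l])],
                            [np.cos(spk_sorted[r]), np.sin(spk_sorted[r])]])
    g2 = np.linalg.solve(base, [np.cos(az), np.sin(az)])
    g2 = np.clip(g2, 0.0, None)
    g2 /= np.linalg.norm(g2) + 1e-12        # constant-power normalisation
    gains = np.zeros(len(spk_sorted))
    gains[[l, r]] = g2
    out = np.zeros_like(gains)
    out[order] = gains                       # undo the azimuth sort
    return out

# The same object renders to a 5.0 ring or an irregular living-room layout:
print(pan_object_2d(15, [-30, 30, 0, -110, 110]))   # standard 5.0
print(pan_object_2d(15, [-40, 25, 140]))            # ad-hoc home setup
```

The same object metadata drives both layouts; only the derived gains change, which is exactly the flexibility that fixed channel-based formats lack.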
Planned Impact
Virtually the whole UK population are consumers of audio content. S3A will deliver to listeners a step change in the quality
of perceived sound, and provide new opportunities for UK creative industries to generate wealth through the artistic
exploitation of new audio-visual technology.
S3A's scientific and engineering advances will ensure that UK research remains at the forefront of spatial audio and pioneers new integrated audio-visual signal processing methodologies. This research will enable UK creative industries (broadcast, film, games, interactive media) to develop and exploit the best future spatial audio production and delivery technologies that add value to listeners' experience. The UK is a world leader in audio-visual content production, a growth sector contributing 12% (£120B) to the economy with over 2.5M employees. Consequently, the UK is extremely well placed to exploit S3A research through creative/technology SMEs (KEF, DTS, Orbitsound), TV (BBC), film (DNeg, Framestore, MPC) and games (EA, Sony, Codemasters). S3A will enable UK creative industries to lead future technologies and standards for spatial audio and object-based audio-visual production.
Pathways to impact include:
(1) collaboration with the BBC to realise S3A technology in the next generation of spatial audio for broadcast and IP networks;
(2) working with games/film/web companies to address their requirements for spatial audio production & reproduction;
(3) leading international open standards for spatial audio through the BBC, who are actively engaged in ISO/MPEG standards for audio and visual content;
(4) licensing of S3A technology to UK SMEs for integration in mobile and home platforms;
(5) engagement with representative bodies to ensure the needs of the hearing impaired are addressed;
(6) collaboration with creatives on showcasing the potential of spatial audio to deliver new listener experiences;
(7) engaging the public in S3A research through spatial audio test broadcasts and web-based interactive media with the BBC;
(8) public engagement with the science behind S3A spatial audio through pilots for feature documentaries on BBC TV/radio, with involvement of co-investigator Prof. Trevor Cox, who is a regular science commentator;
(9) open source tools for amateurs & professionals to edit/stream 3D sound to support active engagement in creative use of spatial audio;
(10) workshops to foster an audio-visual research community bringing together audio and visual researchers from both academia and industry.
Organisations
- University of Surrey (Lead Research Organisation)
- SONY (Collaboration)
- British Broadcasting Corporation (BBC) (Collaboration)
- Bang & Olufsen (Collaboration)
- Audioscenic (Collaboration)
- Xperi (United States) (Project Partner)
- Electronic Arts (United Kingdom) (Project Partner)
- KEF Audio (UK) Ltd (Project Partner)
- Bang & Olufsen (Denmark) (Project Partner)
- Reliance (United Kingdom) (Project Partner)
- British Broadcasting Corporation (United Kingdom) (Project Partner)
- Orbitsound Limited (Project Partner)
- Sony (United Kingdom) (Project Partner)
- Japan Broadcasting Corporation (Japan) (Project Partner)
Publications

Alinaghi A
(2014)
Joint Mixing Vector and Binaural Model Based Stereo Source Separation
in IEEE/ACM Transactions on Audio, Speech, and Language Processing

Baykaner K
(2015)
The Relationship Between Target Quality and Interference in Sound Zone
in Journal of the Audio Engineering Society

Berghi D
(2024)
Leveraging Visual Supervision for Array-Based Active Speaker Detection and Localization
in IEEE/ACM Transactions on Audio, Speech, and Language Processing


Chen J
(2021)
Channel and spatial attention based deep object co-segmentation
in Knowledge-Based Systems


Chourdakis E.T.
(2018)
Modelling experts' decisions on assigning narrative importances of objects in a radio drama mix
in Proceedings of the International Conference on Digital Audio Effects, DAFx

Coleman P
(2014)
Personal audio with a planar bright zone.
in The Journal of the Acoustical Society of America
Title | Effect of Background Music Arrangement and Tempo on Foreground Speech Intelligibility: wav audio files - background music |
Description | A zip folder with sub-folders containing .wav files of background music and speech-shaped noise (SSN) (control masking noise). As used with Tang & Cooke's (2016) HEGP OIM (high-energy glimpse proportion objective intelligibility metric) and in a quantitative, subjective speech-in-noise test to investigate whether or not background music arrangement (in terms of timbre and instrumentation) and tempo have a significant effect on speech intelligibility. The investigation was conducted at the University of Salford in 2018 towards the PhD thesis by P. Demonte (2022). The speech-in-noise test used the original dialogue recording of the Revised Speech Perception In Noise test (RSPIN) spoken sentences (Kalikow, Stevens and Elliott, 1977; Bilger, 1984; Bilger et al., 1984) available on CD-R. Headphone playback of the dialogue was calibrated to an average level of 63 dB(A). The master background music audio files were generated in GarageBand using Apple Loops. The control background noise - speech-shaped noise, a purely energetic masker used for comparison against music - was produced using white noise and samples of the spoken dialogue. The background music and speech-shaped noise audio files in this zip folder were set relative to the dialogue playback level to produce a glimpse proportion value for the dialogue of 10 (GP10), as per the output of Tang & Cooke's (2016) HEGP OIM within a Matlab script using an iterative 'for' loop. That is to say, all the background masking noises were set to different speech-to-noise ratios but produced the same energetic masking level, such that any significant differences in their effect on speech intelligibility would be attributable to other factors. Playback of the dialogue and masking noise audio files was via an Adobe Audition digital audio workstation. For an overview of the speech-to-noise ratios and glimpse proportions of each speech-noise .wav file pairing, see the Excel spreadsheet: https://doi.org/10.17866/rd.salford.19753936 - Effect of Background Music Arrangement and Tempo on Foreground Speech Intelligibility: Listening experiment settings (SNRs, GP, HEGP) spreadsheets. KEY Music - created in GarageBand using Apple Loops: M1 (Apple Loop: Fireplace All): string quartet playing in a legato style; M2 (Apple Loop: Countdown Cello 01): solo cello playing a single note in a staccato, bowed style; M3 (Apple Loops: Countdown Cello 01; Laid Back Classic 01; African King Gyl 04; Big Maracas 03): cello, electric guitar, and lightly percussive instrumentation; M4 (Apple Loops: Countdown Cello 01; Laid Back Classic 01; African King Gyl 04; Big Maracas 03; Lake Shift Bass; Barricade Arpeggio; High Octane Arpeggio; Altered State Beat 02): cello, electric guitar, and more heavily percussive instrumentation; M5_T0: speech-shaped noise (SSN), a purely energetic masking noise used as a control condition to compare any effects of the background music against; no defined tempo. Tempo: T1: 60 beats per minute (bpm); T2: 100 bpm; T3: 140 bpm. GP10 refers to the arbitrary glimpse proportion (= 10) of the spoken sentences relative to the background music or speech-shaped noise level. The audio file names in this zip folder also reflect: * RSPIN list number; * RSPIN sentence number; * the semantic level of the RSPIN sentence that corresponds to each masking noise file (HP = high predictability; LP = low predictability); * the target word of the RSPIN sentence that corresponds to each masking noise.
-------------------------------------------------------------------------- For further details, contact: email (1): p.demonte@edu.salford.ac.uk email (2): philippademonte@gmail.com |
Type Of Art | Film/Video/Animation |
Year Produced | 2022 |
URL | https://salford.figshare.com/articles/media/Effect_of_Background_Music_Arrangement_and_Tempo_on_Fore... |
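The GP10 level-setting procedure described in the entry above (an iterative loop around an objective intelligibility metric) can be sketched as follows. This is a hedged illustration: `glimpse_proportion` is a crude stand-in for Tang & Cooke's (2016) HEGP OIM, which is not reproduced here, and the bisection on masker gain is an assumption about how the Matlab 'for' loop converged on the target.

```python
# Hedged sketch: adjust the masker gain until a glimpse-proportion metric
# reports the target value (GP = 10), mirroring the calibration described
# in the dataset entry above. All function names are illustrative.
import numpy as np

def _frames(x, n=256):
    """Cut a signal into non-overlapping frames of n samples."""
    x = np.asarray(x, dtype=float)
    return x[: len(x) // n * n].reshape(-1, n)

def glimpse_proportion(speech, masker):
    """Hypothetical stand-in for the HEGP OIM: percentage of time-frequency
    cells where speech energy exceeds the masker. The real metric uses an
    auditory filterbank and glimpse logic."""
    s = np.abs(np.fft.rfft(_frames(speech), axis=1))
    m = np.abs(np.fft.rfft(_frames(masker), axis=1))
    n = min(len(s), len(m))
    return 100.0 * np.mean(s[:n] > m[:n])

def set_masker_gain_for_target_gp(speech, masker, target_gp=10.0, tol=0.1):
    """Bisection on masker gain (dB): GP falls as the masker gets louder,
    so the bracket [lo, hi] always contains the gain giving GP = target."""
    lo_db, hi_db = -60.0, 60.0
    for _ in range(60):
        mid_db = 0.5 * (lo_db + hi_db)
        gp = glimpse_proportion(speech, masker * 10 ** (mid_db / 20))
        if abs(gp - target_gp) < tol:
            break
        lo_db, hi_db = (mid_db, hi_db) if gp > target_gp else (lo_db, mid_db)
    return 10 ** (mid_db / 20)     # linear gain to apply to the masker file
```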
Title | Precedence Effect: Listening experiment images |
Description | A zip file containing three .png image files relating to a subjective speech-in-noise listening experiment conducted in the listening room at the University of Salford in March 2020. This experiment, towards the PhD thesis by P. Demonte (2022), investigated whether or not the precedence effect could be utilised to significantly improve speech intelligibility for augmented loudspeaker arrays in the home, with future applications to media device orchestration with object-based audio. The experiment further explored binaural unmasking in terms of: i) binaural masking level difference (BMLD) and ii) the so-called better ear effect (BEE). The listening experiment involved three different loudspeaker array configurations: * L1 + R1 - a regular stereo configuration of two loudspeakers, with both simultaneously reproducing spoken dialogue and background noise; * L1 + R1 + C2 - a three-loudspeaker array, with an auxiliary loudspeaker (C2) in the true centre position (0 degrees azimuth) between the L1 + R1 stereo pair. C2 plays only spoken dialogue, with a 10 ms delay to invoke the precedence effect and provide a boost to the speech signal. Equalisation is also applied to the C2 signal to negate differences in comb-filtering effects between the two- and three-loudspeaker array configurations; * L1 + R1 + R2 - a three-loudspeaker array, with the auxiliary loudspeaker (R2) at +90 degrees azimuth in order to test the better ear effect. As with L1 + R1 + C2, R2 plays only spoken dialogue with a 10 ms delay, and equalisation is applied. The images in this zip file show: * MDO_Array_Listener.png - a photo showing the configuration of the four loudspeakers (for three different loudspeaker array configurations) and the seated listener position in the listening room for the experiment; * MDO_configuration.png - a figure showing the loudspeaker array positions (distances and azimuths from the listener position); * MDO_schematic.png - showing the differences between the two- and three-loudspeaker arrays in terms of the boosts, delays, and equalisation applied. ------------------------------------------------------------------- For further information, contact: email (1): p.demonte@edu.salford.ac.uk email (2): philippademonte@gmail.com |
Type Of Art | Image |
Year Produced | 2022 |
URL | https://salford.figshare.com/articles/figure/Precedence_Effect_Listening_experiment_images/19766881 |
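The three-loudspeaker configurations in the entry above hinge on a 10 ms delay applied to the auxiliary speech feed. A minimal sketch of that signal path, assuming a 48 kHz sample rate and leaving the equalisation filter unspecified (both assumptions, not details from the dataset):

```python
# Minimal sketch of the auxiliary-loudspeaker speech feed: delayed by 10 ms
# so the precedence effect keeps the perceived source at the phantom centre
# while boosting speech level. EQ is indicated but not designed here.
import numpy as np

FS = 48_000                      # sample rate (assumed)

def aux_speech_feed(speech, delay_ms=10.0, eq=None):
    """Delay the speech-only signal for the auxiliary loudspeaker (C2 or R2)."""
    d = int(round(FS * delay_ms / 1000.0))
    delayed = np.concatenate([np.zeros(d), speech])
    if eq is not None:           # per-array EQ to match comb filtering (see text)
        delayed = np.convolve(delayed, eq)[: len(delayed)]
    return delayed

def render_arrays(speech, noise):
    """Feeds for the three configurations described in the entry above."""
    l1 = r1 = speech + noise                 # stereo pair: speech + noise
    return {
        "L1R1_base": (l1, r1),
        "L1R1C2":    (l1, r1, aux_speech_feed(speech)),   # centre aux
        "L1R1R2":    (l1, r1, aux_speech_feed(speech)),   # +90 degrees aux
    }
```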
Title | The Turning Forest |
Description | S3A research produced the cutting-edge object-based audio radio drama which was then converted into the first immersive experience for the BBC - The Turning Forest VR. This is a sound-based real-time CGI VR fairytale for people young and old, inviting audiences into a magical space of imagination, where rustling leaves of an autumn forest are also the footsteps of something familiar, yet strange. |
Type Of Art | Artefact (including digital) |
Year Produced | 2017 |
Impact | The work premiered in April at the Tribeca Film Festival Storyscapes Exhibition, which focuses on cutting-edge artworks that explore new uses of media, highlighting innovation. It went on to win the TVBEurope award for best sound and was a finalist for best VR Google Daydream experience in May 2017 http://www.tvbeurope.com/tvbawards-2016-winners-announced/ |
URL | http://www.s3a-spatialaudio.org/wordpress/ |
Title | The Vostok-K Incident |
Description | The Vostok-K Incident, a science-fiction story created specially for S3A, was designed to take advantage of additional connected devices available to listeners. The S3A researchers used a technology called "object-based media" to flexibly reproduce audio regardless of which devices people connect or how these are arranged. The more devices that are connected, the more immersive the experience becomes, unlocking surround-sound effects as well as extra hidden content. The drama is 13 minutes long and was created specifically to tell a story across extra connected devices. |
Type Of Art | Artefact (including digital) |
Year Produced | 2018 |
Impact | The Vostok-K Incident was launched at the British Science Festival in November 2018. The science-fiction drama was made available online via BBC Taster. |
URL | http://www.s3a-spatialaudio.org/vostok-k |
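For illustration, the media-device-orchestration behaviour described above (more connected devices unlock more content) can be sketched as a simple allocation rule over object metadata. The field names and the round-robin assignment are hypothetical, not the actual S3A/BBC metadata model:

```python
# Hedged sketch of media device orchestration: each audio object declares a
# priority, and an allocator maps objects onto however many devices joined.
from dataclasses import dataclass

@dataclass
class AudioObject:
    name: str
    priority: int          # 0 = essential (dialogue), higher = optional extras
    role: str              # "front", "surround", "hidden"

def allocate(objects, devices):
    """Assign objects to devices; extras unlock only as more devices join."""
    active = [o for o in sorted(objects, key=lambda o: o.priority)
              if o.priority == 0 or len(devices) > 1 + o.priority]
    plan = {d: [] for d in devices}
    for i, obj in enumerate(active):
        plan[devices[i % len(devices)]].append(obj.name)   # round-robin
    return plan

objs = [AudioObject("dialogue", 0, "front"),
        AudioObject("fx_surround", 1, "surround"),
        AudioObject("hidden_radio_chatter", 2, "hidden")]
print(allocate(objs, ["tv"]))                                # dialogue only
print(allocate(objs, ["tv", "phone1", "phone2", "phone3"]))  # extras unlock
```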
Description | S3A is pioneering methods for creating immersive spatial audio experiences for the listener at home or on the move. Research is investigating all aspects of production, from recording and editing through to delivery and practical reproduction at home. S3A has delivered advances in the following areas: - understanding and modelling listener perception of spatial audio in real spaces - perceptual metering of spatial audio - end-to-end production of spatial audio from recording to reproduction - object-based audio recording, editing and manipulation - object-based spatial audio reproduction - listener-centred reproduction - room modelling and adaptation in spatial audio reproduction - audio-visual localisation of sound sources - source separation for multiple object sources - perceptual modelling of intelligibility of sound sources - methods to control and improve intelligibility of content - audio-visual room modelling - creation of new spatial audio experiences for listeners at home - personalised listening experiences to improve accessibility of content based on narrative importance and listener perception of the content - use of transaural speaker arrays to create independent listening experiences for multiple listeners in the same environment These technologies have been integrated to demonstrate enhanced and new listening experiences. Technologies developed in S3A are contributing to international standards and new consumer technologies. S3A has exceeded its original objectives by introducing and spinning out new technologies for personalised immersive audio experiences, including a commercial soundbar technology which creates the experience of virtual headphones, a new generation of audio experiences using media device orchestration, a VR spatial audio experience using room adaptation, and the first open-source tools for object-based spatial audio production. |
Exploitation Route | S3A has contributed new technologies for the creation of immersive audio and audio-visual content which can be experienced by the listener at home. Further exploitation is expected through: - commercial exploitation of novel methods for production of audio and audio-visual content in the creative industries (TV, film, games, internet) - novel methods for listeners to experience spatial audio at home (consumer electronics, TV, film, games) - novel devices for audio and visual content - technologies for perceptual metering of audio in production and reproduction - new creative tools and media experiences - the first open-source tools for object-based spatial audio production - media device orchestration enabling immersive experiences without specialist spatial audio production technology, with public demonstrations e.g. Vostok-K - commercialisation of soundbar technology by spinout AudioScenic to create the experience of virtual headphones - personalised immersive spatial audio experiences - award-winning content creation of immersive spatial audio experiences e.g. The Turning Forest, available on Google Play/Oculus VR and listed as a top-20 VR experience 2016-20 - award-winning broadcast personalised TV content to improve accessibility by exploiting narrative importance e.g. BBC Casualty |
Sectors | Communities and Social Services/Policy, Creative Economy, Digital/Communication/Information Technologies (including Software), Education, Healthcare, Leisure Activities including Sports Recreation and Tourism, Culture Heritage Museums and Collections |
URL | http://www.s3a-spatialaudio.org |
Description | Listener-centred spatial audio reproduction for immersive spatial audio experiences at home and improved content accessibility for the hearing impaired. IP protection, licensing and commercialisation of soundbar technology for personalised spatial audio reproduction of 'virtual headphones' for multiple listeners, with personalised content for each listener. This technology has been commercialised by spinout AudioScenic. The first open-source tools for object-based spatial audio production, enabling use in the creative industries. Content creation to demonstrate the potential of object-based spatial audio production of personalised immersive experiences. Award-winning experiences include The Turning Forest VR, Vostok-K, Casualty and other content released via BBC Taster and public platforms. Media device orchestration demonstrating the capability of practical immersive experience production across an ad-hoc array of devices, demonstrated in the Vostok-K and other MDO experiences. This has resulted in follow-on commercial development to explore the creative potential of MDO. S3A established the foundations for the EPSRC/BBC Prosperity Partnership 'AI4ME - Personalised Object-based Media Experiences for All', a 5-year collaboration led by the BBC and the University of Surrey in collaboration with Lancaster University and 15 leading companies across the UK media industry. AI4ME builds directly on the pioneering research in object-based media delivered by S3A and will deliver a new generation of media experiences which are personalised to individual interest, accessibility requirements, device and location. |
First Year Of Impact | 2013 |
Sector | Creative Economy,Digital/Communication/Information Technologies (including Software),Healthcare,Culture, Heritage, Museums and Collections |
Impact Types | Cultural, Societal, Economic |
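The 'virtual headphones' soundbar mentioned in the entries above rests on transaural reproduction, i.e. crosstalk cancellation. The sketch below is a hedged illustration of that principle only, not AudioScenic's or S3A's implementation; array shapes and the regularisation constant are assumptions.

```python
# Hedged sketch of crosstalk cancellation: per frequency bin, the 2x2
# speaker-to-ear transfer matrix H is inverted (with Tikhonov regularisation)
# so each ear receives only its intended binaural signal.
import numpy as np

def ctc_filters(H, beta=1e-3):
    """H: (nbins, 2 ears, 2 speakers) complex transfer functions.
    Returns C: (nbins, 2 speakers, 2 binaural channels) with H @ C ~ I."""
    nbins = H.shape[0]
    C = np.zeros((nbins, 2, 2), dtype=complex)
    eye = np.eye(2)
    for k in range(nbins):
        Hk = H[k]
        # Regularised pseudo-inverse limits speaker effort at ill-conditioned bins.
        C[k] = Hk.conj().T @ np.linalg.inv(Hk @ Hk.conj().T + beta * eye)
    return C

# Toy usage: random plant, verify H @ C approximates the identity per bin.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 2, 2)) + 1j * rng.standard_normal((4, 2, 2))
C = ctc_filters(H, beta=1e-6)
print(np.round(H[0] @ C[0], 3))   # ~ [[1, 0], [0, 1]]
```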
Description | Ofcom Object-based Media Working Group |
Geographic Reach | National |
Policy Influence Type | Participation in a guidance/advisory committee |
Impact | Influence on media communication and service regulation |
URL | https://www.ofcom.org.uk |
Description | Audio-Visual Media Research Platform |
Amount | £1,577,223 (GBP) |
Funding ID | EP/P022529/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 07/2017 |
End | 07/2023 |
Description | BBC Prosperity Partnership: Future Personalised Object-based Media Experiences Delivered at Scale Anywhere |
Amount | £8,500,000 (GBP) |
Funding ID | EP/V038087/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 06/2021 |
End | 06/2026 |
Description | EPSRC I-case studentship |
Amount | £107,560 (GBP) |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2015 |
End | 09/2020 |
Description | EPSRC i-case studentship |
Amount | £108,580 (GBP) |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2016 |
End | 09/2021 |
Description | ICURe - support for junior researcher Dr J. Francombe for 3 months, able to claim up to £35,000 of travel and expenditure, to carry out market validation of research-based business ideas and to receive intensive support in developing them. |
Amount | £35,000 (GBP) |
Organisation | SETsquared Partnership |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 03/2016 |
End | 07/2016 |
Description | Polymersive: Immersive Video Production Tools for Studio and Live Events |
Amount | £726,251 (GBP) |
Funding ID | 105168 |
Organisation | Innovate UK |
Sector | Public |
Country | United Kingdom |
Start | 03/2019 |
End | 09/2020 |
Description | UKRI AI Centre for Doctoral Training in AI for Digital Media Inclusion |
Amount | £12,959,982 (GBP) |
Funding ID | EP/Y030915/1 |
Organisation | Royal Holloway, University of London |
Sector | Academic/University |
Country | United Kingdom |
Start | 03/2024 |
End | 09/2032 |
Title | BST - Binaural Synthesis Toolkit |
Description | The Binaural Synthesis Toolkit is a modular and open-source package for binaural synthesis, i.e., spatial audio reproduction over headphones or transaural loudspeaker systems. It supports different reproduction methods (dynamic HRIR synthesis, HOA-based rendering, and BRIR-based virtual loudspeaker rendering), dynamic head tracking, and current data formats such as SOFA. It is based on the VISR framework and implemented in Python, which means that the code is relatively accessible and open to adaptation and extension, as well as being reusable in larger audio processing algorithms. The BST is provided to foster reproducible audio research. It is mainly targeted at researchers in sound reproduction and perception, but it could be used by enthusiasts as well. |
Type Of Material | Computer model/algorithm |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | * AES convention paper: Andreas Franck, Giacomo Costantini, Chris Pike, and Filippo Maria Fazi, "An Open Realtime Binaural Synthesis Toolkit for Audio Research," in Proc. Audio Eng. Soc. 144th Conv., Milano, Italy, 2018, Engineering Brief. |
URL | http://cvssp.org/data/s3a/public/ |
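As a plain-NumPy illustration of the dynamic HRIR synthesis the toolkit performs (not the BST API itself), the sketch below corrects the source azimuth by the tracked head yaw, selects the nearest measured HRIR pair, and convolves a mono source onto two ears. In practice the HRIRs would be loaded from a SOFA file; the dummy two-tap IRs here are placeholders.

```python
# Hedged sketch of the core operation behind dynamic HRIR synthesis.
import numpy as np

def binauralise(mono, source_az_deg, head_yaw_deg, hrir_db):
    """hrir_db: dict mapping measured azimuth (deg) -> (left_ir, right_ir).
    Head tracking updates head_yaw_deg; the nearest HRIR pair is chosen."""
    relative_az = (source_az_deg - head_yaw_deg + 180) % 360 - 180
    nearest = min(hrir_db, key=lambda a: abs(a - relative_az))
    left_ir, right_ir = hrir_db[nearest]
    left = np.convolve(mono, left_ir)
    right = np.convolve(mono, right_ir)
    return np.stack([left, right])   # (2, n) binaural output

# Toy usage with dummy 2-tap IRs standing in for measured HRIRs:
db = {az: (np.array([1.0, 0.1]), np.array([0.8, 0.2]))
      for az in range(-180, 180, 5)}
out = binauralise(np.random.randn(480), source_az_deg=30,
                  head_yaw_deg=10, hrir_db=db)
```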
Title | Compensated Stereo Panning Perceptual Test Data |
Description | A Low Frequency Panning Method with Compensation for Head Rotation. IEEE/ACM Transactions on Audio, Speech, and Language Processing. |
Type Of Material | Database/Collection of data |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | Unprocessed results of listening tests for Compensated Amplitude Panning Reproduction of spatial audio. Details in 'A Low Frequency Panning Method with Compensation for Head Rotation' |
URL | https://eprints.soton.ac.uk/415757/ |
Title | Data for 'Evaluation of Spatial Audio Reproduction Methods (Part 2): Analysis of Listener Preference' |
Description | Data accompanying the paper "Evaluation of Spatial Audio Reproduction Methods (Part 2): Analysis of Listener Preference". |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | https://openresearch.surrey.ac.uk/esploro/outputs/dataset/99515134302346 |
Title | Effect of Background Music Arrangement and Tempo on Foreground Speech Intelligibility: Listening experiment settings (SNRs, GP, HEGP) spreadsheets. |
Description | Excel spreadsheet containing data collected and collated from objective and subjective testing of whether or not background music arrangement (timbre and instrumentation density) and tempo have any significant effect on foreground speech intelligibility. The values of the objective data - speech-to-noise ratios (dB SNR), glimpse proportions (GP), and high-energy glimpse proportions (HEGP) - were generated and collected in a Matlab script that incorporated Tang & Cooke's (2016) HEGP OIM (high-energy glimpse proportion objective intelligibility metric) together with an iterative 'for' loop. The subjective data were collected in a standard speech-in-noise test (SINT), in which participants listened via headphones to speech played simultaneously with either background music or a control masking noise, and were tasked with identifying the final word of each spoken sentence (target word). The listening experiment used the RSPIN speech corpus. Background music stimuli were generated by the researcher using Apple Loops in GarageBand. The 'Read Me' page provides: a brief overview of the listening experiment; the citation and link for Tang & Cooke's (2016) HEGP OIM; a key to explain the shorthand of the independent variables and file names, and an overview of the other spreadsheets. 'Various_GP' is an overview of equivalent speech-to-noise ratios (dB SNR) determined for three different glimpse proportion (GP) values using the speech and music-masker/masking-noise pairs in the Matlab script. These objective values were generated to determine which target glimpse proportion to set all the masking noise files to for the subjective listening experiment. 'GP10_SNRs' shows two tables: one with the GP values that each masking noise file was set to and the corresponding SNRs; the other table shows this information summarised across 300 speech-noise audio file pairs. 'Results' shows the raw subjective listening experiment data collected, collated, and sorted by participant ID number, RSPIN list and RSPIN sentence number. This table has pulled in the relevant speech-to-noise ratio, glimpse proportion, and high-energy glimpse proportion values from the previous page. 'Summaries' shows tables of the data collated in different ways for the purpose of generating box-and-whisker plots and conducting statistical analyses. Each table is a summary by participant ID (rows) and the speech-background music / masking noise combination of independent variables: total number of trials; summed correct word scores; mean correct word recognition percentages; mean speech-to-noise ratios (dB SNR); mean glimpse proportions (GP), and mean high-energy glimpse proportions (HEGP). ------------------------------------------------------------------- For further details, see the PhD thesis by P. Demonte (2022), or contact: email (1): p.demonte@edu.salford.ac.uk email (2): philippademonte@gmail.com See also the Excel spreadsheet with the listening experiment data and statistical analyses: https://doi.org/10.17866/rd.salford.19745815 'Effect of Background Music Arrangement and Tempo on Foreground Speech Intelligibility: Listening experiment data'. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://salford.figshare.com/articles/dataset/Effect_of_Background_Music_Arrangement_and_Tempo_on_Fo... |
Title | Precedence Effect: Listening experiment data and statistical analyses spreadsheets |
Description | Excel spreadsheet with the data collected from a subjective, quantitative speech-in-noise test (SINT) conducted in the Listening Room at the University of Salford in March 2020. The listening experiment tested how the psychoacoustic phenomenon of the precedence effect can be utilised with augmented loudspeaker arrays in an object-based audio paradigm to improve speech intelligibility in the home environment. A practical application of this research will be in the implementation of media device orchestration, i.e. the creation of low-cost, ad-hoc loudspeaker arrays using commonly found devices, such as mobile phones, laptop computers, tablets, smart speakers, and so on, to spatialise audio in the home. This speech-in-noise test was conducted under controlled conditions. With audio reproduced by one of three different loudspeaker arrays in a given trial, subjects listened to spoken sentences played simultaneously with noise. They were tasked with correctly identifying target words. Correct word scores, collated and converted to word recognition percentages, act as a quantifiable proxy for speech intelligibility. After confirming that the data fulfilled the criteria for use, they were statistically analysed using 2-way RMANOVA. The three loudspeaker array configurations were: * L1R1_base (a two-loudspeaker control condition): a stereo pair of front left and front right loudspeakers at -/+30 degrees azimuth and 2 m distance from the listener position; speech + noise reproduced by both loudspeakers. * L1R1C2 (three loudspeakers): L1R1_base + an additional (AUX) loudspeaker in the true front centre position (0 degrees azimuth and 1.7 m distance from the listener position) reproducing just speech. * L1R1R2 (three loudspeakers): L1R1_base + an AUX loudspeaker in the right-hand position (+90 degrees azimuth and 1.7 m distance from the listener position) reproducing just speech. For the three-loudspeaker configurations, the precedence effect was initiated by applying a 10 ms delay to the speech signal reproduced by the AUX loudspeaker, such that the sound source (first arrivals) would still be perceived as coming from the phantom centre between the L1 and R1 loudspeakers, but with a boost to the speech signal. The relevant equalisation (EQ) was applied to the speech signal for the C2 and R2 AUX loudspeakers to maintain the same perceived comb-filtering effects for all three loudspeaker array configurations. Analysis of the results is provided in the PhD thesis by P. Demonte. ----------------------------------------------------------------------- Spreadsheet pages: * Read Me - provides a more in-depth explanation of the independent variables tested * Raw data - as collected in the speech-in-noise test. The columns denote: subject number; trial number; audio files playing from each loudspeaker in a trial; loudspeaker array configuration; masking noise type; Harvard speech corpus list and sentence number; spoken sentence played; the five target words in each sentence; the sentence as heard and noted by the subject; correct word score applied (out of a total of 5 per trial); correct word ratio. * CWR_all - correct word percentages collated for each subject for each combination of independent variables, and the corresponding studentized residuals as a quality check for outliers.
* NormalDistTest - criteria for normal distribution (Shapiro-Wilk test) * 2-way RMANOVA_16subjects - Mauchly's test of sphericity, and Tests of Within-Subjects Effects (2-way RMANOVA) * SimpleMainEffects - analysis of the conditional effects * Participants_MainTest - anonymised data collated from the subjects via a short pre-screening questionnaire: age; gender; handedness (left or right); confirmation of subjects as native English speakers, and whether or not they are bi-/multilingual in case of outliers. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://salford.figshare.com/articles/dataset/Precedence_Effect_Listening_experiment_data_and_statis... |
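A hedged sketch of the analysis pipeline this spreadsheet documents: a Shapiro-Wilk normality check per condition cell, followed by a two-way repeated-measures ANOVA (array configuration x masking noise). The column names and the CSV export are assumptions about the spreadsheet layout; pingouin is one library whose `rm_anova` handles one- and two-way within-subject designs.

```python
# Hedged sketch of the statistical checks and 2-way RMANOVA named above.
import pandas as pd
import pingouin as pg
from scipy.stats import shapiro

# Hypothetical long-format export of the 'CWR_all' page: one row per
# subject x array configuration x masking noise, with correct-word ratio.
df = pd.read_csv("precedence_effect_cwr.csv")

# Normality criterion per condition cell (Shapiro-Wilk):
for (arr, noise), cell in df.groupby(["array", "noise"]):
    print(arr, noise, "Shapiro-Wilk p =", round(shapiro(cell["cwr"]).pvalue, 3))

# Two-way repeated-measures ANOVA (array x noise, within subjects):
aov = pg.rm_anova(data=df, dv="cwr", within=["array", "noise"],
                  subject="subject")
print(aov)
```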
Title | Room Impulse Responses (RIRs) and Visualisation |
Description | RIR datasets captured as part of the S3A project and supplementary material. |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | Permission is granted to use the S3A Room Impulse Response dataset for academic purposes only, provided that it is suitably referenced in publications related to its use |
URL | http://cvssp.org/data/s3a/ |
Title | S3A radio drama scenes |
Description | Data created for the S3A Radio Drama |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | The Radio Drama was used for the creation of The Turning Forest VR, the BBC's award-winning first VR production |
URL | http://cvssp.org/data/s3a/ |
Title | S3A speaker tracking with Kinect2 |
Description | Person tracking using audio and depth cues; identity association using PHD filters in multiple head tracking with depth sensors |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | Permission is granted to use the S3A speaker tracking dataset for academic purposes only, provided that it is suitably referenced in publications related to its use |
URL | http://cvssp.org/data/s3a/ |
Title | S3A speaker tracking with Kinect2 |
Description | Person tracking using audio and depth cues; identity association using PHD filters in multiple head tracking with depth sensors |
Type Of Material | Database/Collection of data |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | Datasets - Open access |
URL | https://www.s3a-spatialaudio.org/datasets |
Title | Speech-To-Screen: Listening Experiment Data and Statistical Analyses spreadsheets |
Description | An Excel spreadsheet related to the Speech-To-Screen listening experiment conducted in the Listening Room at the University of Salford in 2017 as part of the EPSRC-funded S3A Future Spatial Audio at Home project. The aim of the experiment was to test the effect on speech intelligibility of different binaural auralisations of speech and noise related to headphone playback with small-screen devices. The experiment involved a speech-in-noise test, whereby subjects had to identify target words, in this case letter-number pairs, in spoken sentences played simultaneously with either speech-shaped noise (SSN) or speech-modulated noise (SMN). Collated correct word scores, converted to word recognition percentages, then acted as a proxy for quantifying speech intelligibility for the different conditions tested. Spreadsheet includes: * Read Me page - including an overview of the independent variables; * Raw data collected: letter-number combinations entered by subjects into a graphical user interface for each trial; * Correct word scores for each trial; * Scores summed by subject and combination of conditions, then converted to ratios and percentages; * Criterion checks for use of 3-way RMANOVA; * Statistical analyses using 3-way RMANOVA and post-hoc pairwise comparisons; * Further analyses, including quantified intelligibility of the 16 different speakers' content from the GRID speech corpus used. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
URL | https://salford.figshare.com/articles/dataset/Speech-To-Screen_Listening_Experiment_Data_and_Statist... |
Title | VISR - Versatile Interactive Scene Renderer |
Description | The VISR framework is a general-purpose software framework for audio processing that is well suited to multi-channel, spatial and object-based audio. It is an extensible, modular, and portable software framework that is currently being released under an open-source licence. Target audiences are researchers in audio and related disciplines, e.g. audiology. VISR differs from existing software products in several ways. Firstly, it is well suited for integration into other software environments, e.g. digital audio workstations or graphical programming languages such as Max/MSP, in order to make functionality implemented in VISR available to a wider group of researchers, creatives, and enthusiasts. Secondly, a thorough integration of the Python language enables easy prototyping and adaptation of audio processing systems and makes it accessible to a wider group of users. Thirdly, it enables algorithm design in traditional environments such as Matlab or Python and their realtime implementation using the same code base, which has the potential to streamline the workflow of many audio research and development tasks. |
Type Of Material | Computer model/algorithm |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | * Used for the rendering and for the subjective evaluation of most achievements within the S3A project (including the radio drama scenes and the Media Device Orchestration (MDO) technology). * Used as the DSP and development platform of the transaural soundbar technology (S3A and Soton Audio Labs). * Used in art installations (e.g., The Trembling Line), science fairs and open days. * Forms the technical basis for collaboration between BBC R&D and S3A/University of Southampton |
URL | http://cvssp.org/data/s3a/public/VISR |
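The component-based design described in the entry above can be illustrated generically. The sketch below is NOT the VISR API; it only shows the pattern the description refers to: components process fixed-size sample blocks and compose into larger signal flows, so the same code can run offline or inside a realtime audio callback.

```python
# Generic illustration of a block-based, component-oriented audio framework.
import numpy as np

class Component:
    """Base class: process one (channels, block_size) block per call."""
    def process(self, block):
        raise NotImplementedError

class Gain(Component):
    def __init__(self, gain_db):
        self.g = 10 ** (gain_db / 20)
    def process(self, block):
        return self.g * block

class Chain(Component):
    """Composite component, analogous to connecting atomic components."""
    def __init__(self, *components):
        self.components = components
    def process(self, block):
        for c in self.components:
            block = c.process(block)
        return block

# Offline rendering and a realtime host can share this code path: a host
# audio API would simply call `chain.process` once per hardware buffer.
chain = Chain(Gain(-6.0))
out = chain.process(np.random.randn(2, 512))
```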
Description | AudioScenic |
Organisation | Audioscenic |
Country | United Kingdom |
Sector | Private |
PI Contribution | Advisor/collaboration on spatial audio research and development for personalised media |
Collaborator Contribution | Participation in industry events |
Impact | Research advice in spatial audio |
Start Year | 2021 |
Description | BBC Research and Development |
Organisation | British Broadcasting Corporation (BBC) |
Country | United Kingdom |
Sector | Public |
PI Contribution | Research in computer vision for broadcast production and audio. Technologies for 3D production, free-viewpoint video in sports, stereo production from monocular cameras, and video annotation. Member of the BBC Audio Research Partnership - developing the next generation of broadcast technology. |
Collaborator Contribution | In kind contribution (members of Steering/Advisory Boards) Use of the BBC lab and research/development facilities. Studentships (industrial case) funding and co-supervision of PhD students. |
Impact | Multi-disciplinary collaboration involves Computer Vision, Video Analysis, Psychoacoustics, Signal Processing and Spatial Audio |
Description | Bang and Olufsen |
Organisation | Bang & Olufsen |
Country | Denmark |
Sector | Private |
PI Contribution | Spatial audio research (POSZ and S3A EPSRC funded projects) |
Collaborator Contribution | Scholarships (fees and bursaries) for EU/Home students. In-kind contribution by members of the B&O Research department (Soren Bech, member of Steering/Advisory Boards and co-supervisor of funded students). Use of research facilities at their labs in Denmark. |
Impact | Publications listed on http://iosr.uk/projects/POSZ/ Multi-disciplinary Collaboration: Signal Processing, Psychoacoustics and Spatial audio |
Description | Sony Broadcast and Professional Europe |
Organisation | SONY |
Department | Sony Broadcast and Professional Europe |
Country | United Kingdom |
Sector | Private |
Start Year | 2004 |
Title | BST - Binaural Synthesis Toolkit |
Description | The Binaural Synthesis Toolkit is a modular and open-source package for binaural synthesis, i.e., spatial audio reproduction over headphones or transaural loudspeaker systems. It supports different reproduction methods (dynamic HRIR synthesis, HOA-based rendering, and BRIR-based virtual loudspeaker rendering), dynamic head tracking, and current data formats such as SOFA. It is based on the VISR framework and implemented in Python, which means that the code is relatively accessible and open to adaptation and extension, as well as being reusable in larger audio processing algorithms. The BST is provided to foster reproducible audio research. It is mainly targeted at researchers in sound reproduction and perception, but it could be used by enthusiasts as well. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | * AES convention paper: Andreas Franck, Giacomo Costantini, Chris Pike, and Filippo Maria Fazi, "An Open Realtime Binaural Synthesis Toolkit for Audio Research," in Proc. Audio Eng. Soc. 144th Conv., Milano, Italy, 2018, Engineering Brief. |
URL | http://cvssp.org/data/s3a/public/ |
Title | VISR - Versatile Interactive Scene Renderer |
Description | The VISR framework is a general-purpose software framework for audio processing that is well suited to multi-channel, spatial and object-based audio. It is an extensible, modular, and portable software framework that is currently being released under an open-source licence. Target audiences are researchers in audio and related disciplines, e.g. audiology. VISR differs from existing software products in several ways. Firstly, it is well suited for integration into other software environments, e.g. digital audio workstations or graphical programming languages such as Max/MSP, in order to make functionality implemented in VISR available to a wider group of researchers, creatives, and enthusiasts. Secondly, a thorough integration of the Python language enables easy prototyping and adaptation of audio processing systems and makes it accessible to a wider group of users. Thirdly, it enables algorithm design in traditional environments such as Matlab or Python and their realtime implementation using the same code base, which has the potential to streamline the workflow of many audio research and development tasks. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | * Used for the rendering and for the subjective evaluation of most achievements within the S3A project (including the radio drama scenes and the Media Device Orchestration (MDO) technology). * Used as the DSP and development platform of the transaural soundbar technology (S3A and Soton Audio Labs). * Used in art installations (e.g., The Trembling Line), science fairs and open days. * Forms the technical basis for collaboration between BBC R&D and S3A/University of Southampton |
URL | http://cvssp.org/data/s3a/public/VISR |
Company Name | Audioscenic |
Description | Audioscenic develops 3D loudspeaker technology. |
Year Established | 2017 |
Impact | The company is classified under "information technology consultancy activities" (SIC 62020), "business and domestic software development" (SIC 62012) and "manufacture of consumer electronics" (SIC 26400). The company changed its name on 2018-11-08; its previous name was Soton Audio Labs Limited. |
Website | http://www.audioscenic.com |
Description | 3rd UK-Korea Focal Point Workshop, held in conjunction with ACM Multimedia 2018 (22-27 October 2018), British Embassy Seoul, UK Science & Innovation Network - Intelligent Virtual Reality: Deep Audio-Visual Representation Learning for Multimedia Perception and Reproduction |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The workshop was a good opportunity to bring together leading experts in audio processing and computer vision, and to bridge the gap between two research fields in multimedia content production and reproduction. It was not limited to the UK and Korean research groups but accommodated various groups from around the world through the international conference. - The soundbar demo was given to the 4DPlex Innovation team, to the R&D team, and to part of their executive team. 4DPlex was impressed by the quality of the demo and suggested staying in touch for further conversations about the commercial development of the technology in a cinema application. - After the demonstration and meeting with ProGate, ProGate and Soton Audio Labs at Southampton are to sign a contract in which ProGate will act as sales representative of Soton Audio Labs to liaise with Korean consumer electronics companies. - Joint publication: Changjae Oh, Bumsub Ham, Hansung Kim, Adrian Hilton and Kwanghoon Sohn, "OCEAN: Object-Centric Arranging Network for Self-supervised Visual Representations Learning," Expert Systems With Applications, submitted June 2018. Potential applications of collaboration results: - Loudspeaker arrays for cinema surround sound on the 4D cinema system by CJ 4DX - Listener-adaptive laptop loudspeaker array system for the games industry - VR system with immersive 3D visual content and spatial audio with KIST |
Year(s) Of Engagement Activity | 2018 |
Description | AES - Good vibrations bringing Radio Drama to life - Eloise Whitmore |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | AES meeting in Cambridge on Wednesday 16 December 2015. Talk by Eloise Whitmore about radio drama and S3A's cutting-edge production methods, such as object-based audio and 3D sound design. |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.aes-uk.org/forthcoming-meetings/good-vibrations-bringing-radio-drama-to-life/ |
Description | AES Convention Berlin 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Various papers presented, including "Acoustic Room Modelling Using a Spherical Camera for Reverberant Spatial Audio Objects". The poster sessions also provided an opportunity for networking and interaction with other practitioners to spread the word about the technologies being developed by S3A. The paper is accessible at http://epubs.surrey.ac.uk/id/eprint/813849 |
Year(s) Of Engagement Activity | 2017 |
URL | http://www.aes.org/events/142/ |
Description | Audio Mostly 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Audio Mostly is an interdisciplinary conference on the design and experience of interaction with sound, embracing applied theory and reflective practice. It brings together thinkers and doers from academia and industry who share an interest in sonic interaction and the use of audio for interface design. All papers are peer reviewed and published in the ACM Digital Library. S3A participated with the demo "Media device orchestration for immersive spatial audio" and was voted runner-up for the best demo award: http://audiomostly.com/conference-program/awards/ |
Year(s) Of Engagement Activity | 2017 |
URL | http://audiomostly.com/conference-program/awards/ |
Description | Aura Satz: The Trembling line exhibition |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | The Trembling Line is an exhibition by Aura Satz exploring acoustics, vibration, sound visualisation and musical gesture, with an aim to wrest the space between sound and image to see how far these can be stretched apart before they fold back into one another. The centrepiece of the show is the film and sound installation The Trembling Line, which explores visual and acoustic echoes between decipherable musical gestures and abstract patterning, orchestral swells and extreme slow-motion close-ups of strings and percussion. It features a score by Leo Grant and an innovative multichannel audio system by the Institute of Sound and Vibration Research (ISVR), University of Southampton, as part of the S3A research project on immersive listening. |
Year(s) Of Engagement Activity | 2015,2016 |
URL | http://www.hansardgallery.org.uk/event-detail/199-aura-satz-the-trembling-line/ |
Description | BBC Sound now and next |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | The S3A Programme Grant was represented at the Technology Fair with demos (as per the list below) and the Radio Drama production (James Woodcock), which was showcased in the BBC Demo Room. - Towards perceptual metering for an object-based audio system [Dr Jon Francombe, University of Surrey, and Yan Tang, University of Salford] - 3D head tracking for spatial audio and audio-visual speaker tracking [Dr Teo de Campos, University of Surrey, and Dr Marcos Simon Galvez, University of Southampton] - Headphone simulation of 3D spatial audio systems in different listening environments [Dr Rick Hughes, University of Salford, and Chris Pike, BBC] The demos and radio drama generated significant attention from other attendees, external organisations and universities. The S3A Advisory Steering Board commended S3A for the rapid progress and the impact of the demos. |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.bbc.co.uk/rd/blog/2015-06-sound-now-next-watch-talks-online |
Description | BBC Sounds Amazing |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Industry/academic forum for research and production industry professionals in audio and sound hosted by the BBC |
Year(s) Of Engagement Activity | 2021,2022 |
URL | https://www.bbc.co.uk/academy/events/sounds-amazing-2022/ |
Description | BBC live streamed event - Opera passion day |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | The S3A project set up and ran equipment for a "Can a soprano singer shatter glass?" experiment (breaking a wine glass with an opera singer's voice) for a live-streamed BBC Tomorrow's World feature as part of #OperaPassion Day (https://www.bbc.co.uk/events/epdgfx/live/cvwbj5). This was held at Manchester's Museum of Science and Industry and was streamed live on the BBC events page and BBC Facebook, with video footage later featured on the front page of the main BBC website. The intention was to engage with the public on the physics of sound and the human voice, with total online views for the whole #OperaPassion Day event exceeding 0.5 million. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.bbc.co.uk/events/epdgfx/live/cvwbj5 |
Description | CVPR |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Primary international forum for computer vision and AI/machine learning research in audio-visual media. Research dissemination through papers, invited keynote talks and workshop organisation. |
Year(s) Of Engagement Activity | 2021,2022,2023 |
URL | https://cvpr2023.thecvf.com |
Description | CVSSP 30th Anniversary Celebration |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | Centre for Vision, Speech and Signal Processing (CVSSP) 30th Anniversary Celebration with over 500 participants from industry, government and alumni. The event, themed 'Can Machines Think?', included a series of keynote talks from alumni who are international leaders in academia and industry, over 30 live demos of current research, and an open house at the centre for both industry and guests. There was also a VIP dinner hosted by the Vice-Chancellor of the University. |
Year(s) Of Engagement Activity | 2019 |
URL | http://surrey.ac.uk/cvssp |
Description | Camp Bestival 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | Camp Bestival is a large family-oriented music festival taking place in Lulworth, Dorset each year over four days. The science tent hosts a variety of shows and demonstrations aimed at educating and inspiring children. We presented two audio demos: a binaural dummy head and a soundbar. Participants listened on headphones while sounds were made around the dummy head, and the soundbar beamed three different streams of music in different directions. Despite challenging listening conditions the demos were extremely well received, and will certainly have triggered the curiosity of some young minds.
Year(s) Of Engagement Activity | 2017 |
URL | https://www.festicket.com/festivals/camp-bestival/2017/ |
Description | DAFx Conference Edinburgh International Conference on Digital Audio Effects |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | S3A presented its object-based reverberation work at the International Conference on Digital Audio Effects (DAFx) 2017 in Edinburgh. S3A was invited to present demonstrations at five poster sessions throughout the conference, leading to discussion and networking with international industry contacts (Apple, HTC, Dolby, IRCAM, Magic Leap) and academic contacts (TU Köln, Aalto, York, Audio Labs Erlangen). This engagement has paved the way for future impact around the reverberation approach used in S3A.
Year(s) Of Engagement Activity | 2017 |
URL | http://www.dafx17.eca.ed.ac.uk/ |
Description | EUSIPCO 2017 European Signal Processing Conference |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Opportunity for S3A to be highlighted among colleagues in the audio signal processing and machine learning field. The paper's findings generated particular interest, including discussion of whether an exact mathematical model could provide theoretical support for the proposed perceptual model for speech enhancement.
Year(s) Of Engagement Activity | 2017 |
URL | https://www.eusipco2017.org/ |
Description | European Conference on Visual Media Production |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation of research advances at an industry-academic forum.
Year(s) Of Engagement Activity | 2021,2022 |
URL | https://www.cvmp-conference.org/ |
Description | Exhibition at the Consumer Electronics Show, Vegas |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | S3A launched a laptop-friendly 3D stereo soundbar at the global showcase in Las Vegas, exhibiting through the Future Worlds Accelerator, the only UK university exhibitor at CES for a fourth consecutive year.
Year(s) Of Engagement Activity | 2019 |
URL | https://www.s3a-spatialaudio.org/s3a-at-ces-2019 |
Description | Interview on BBC Radio 4 |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Interview with Eddie Mair explaining the concepts and technology being developed by S3A, especially regarding speech intelligibility.
Year(s) Of Engagement Activity | 2018 |
URL | https://www.bbc.co.uk/programmes/b09tc4q3 |
Description | Invited guest lecture, Limerick Institute of Technology, Ireland
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Guest lecture at Limerick Institute of Technology, Ireland, to an audience of undergraduate and postgraduate students and departmental staff in the field of music production and broadcast engineering. The audience showed interest in the work, particularly narrative importance and media device orchestration (MDO), and a number wished to be contacted about future studies and online surveys.
Year(s) Of Engagement Activity | 2018 |
Description | JUCE Audio Developer Conference 2018 - London, 19-21 Nov 2018
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Head-tracked object-based binaural demonstration using the VISR Production Suite, our suite of DAW plugins for producing and reproducing object-based audio. Attendees were mostly from the audio industry and audio software companies (e.g. Dolby, Steinberg, Vienna Symphonic Library), with some from universities (UWE, University of Bristol). Several attendees were keen to try out the open-source software.
Year(s) Of Engagement Activity | 2018 |
Description | Presentation at the LVA/ICA Conference held at the University of Surrey, July 2018
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Audio Day workshop as part of the LVA/ICA 2018 conference, including presentations by S3A researchers, e.g. "Deep Learning for Speech Separation" (Qingju Liu).
Year(s) Of Engagement Activity | 2018 |
URL | https://www.surrey.ac.uk/events/20180702-lvaica-2018-14th-international-conference-latent-variable-a... |
Description | S3A visit to Parma University - Casa della Musica |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | As part of the collaboration with Parma University and the Casa della Musica, S3A hosted a classical concert open to the general public. The aim of the event was to record the concert using different microphone arrays and S3A technology, as well as 360° video. The concert was organised in collaboration with Parma University and the Conservatorio Arrigo Boito, and the recording is being used for further research.
Year(s) Of Engagement Activity | 2017 |
URL | http://www.comune.parma.it/notizie/news/CULTURA/2017-01-12/Progetto-S3A-Audio-spaziale-il-meeting-in... |
Description | Soundbar Technology market research |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | As part of the ICURe programme, Dr Marcos Simon Galvez has had the opportunity to discuss his research on soundbars and further use of the technology with industry (nationally and internationally). |
Year(s) Of Engagement Activity | 2016,2017 |
Description | Taunton STEM Festival
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Schools |
Results and Impact | S3A presented an object-based sound system and interactive content at a major STEM festival in Taunton, targeting primary and secondary school pupils. The event was covered by the local press.
Year(s) Of Engagement Activity | 2016 |
Description | UK- Korea Focal Point Workshop in Seoul, Korea / Visit research institutions in South Korea |
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Seven S3A members visited South Korea for the 2nd UK-Korea Focal Point Workshop on Audio-Visual Representation in January 2018. The main workshop was held at Yonsei University and around 60 people attended; four members gave presentations on S3A research topics. The group also visited CJ 4DX, KIST (Korea Institute of Science and Technology) and Korea University to establish new links for future research collaboration. Several immediate areas for collaboration were identified, along with opportunities for future collaborative research funding applications. Complementary research strengths in immersive media and AI are particularly strategic given the UK industrial strategy initiatives in this area. Plans for a further workshop in Seoul in the autumn, in conjunction with the ACM Multimedia conference, were built on the links established during this visit.
Year(s) Of Engagement Activity | 2018 |
URL | http://ee.yonsei.ac.kr/ee_en/community/academic_notice.do?mode=view&articleNo=22313&article.offset=0... |
Description | University of Surrey - Festival of Wonder |
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | S3A's Sound Sphere was installed at the University of Surrey's 50th anniversary celebration, the "Festival of Wonder". A surround sound version of the S3A Autumn Forest radio drama was played, and the audience was able to interact with the content by moving the narrator's position on an iPad.
Year(s) Of Engagement Activity | 2017 |
URL | https://www.youtube.com/watch?v=fhRuz7q4XX0 |
Description | Vostok-K demonstration at Manchester Science Festival |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | Demonstration of MDO and The Vostok-K Incident in the Air and Space Hall at the Science and Industry Museum in Manchester, as part of Manchester Science Festival. Visitors were introduced to the concept of MDO and given a live demonstration of the BBC Taster experience.
Year(s) Of Engagement Activity | 2018 |
Description | Winchester Cathedral Primary Science Festival |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Schools |
Results and Impact | An acoustic workshop lasting over 50 minutes was run with six groups of 16 primary school children aged 9-11, as part of a Science Festival at Winchester Cathedral (November 2016). The workshop was carried out by Steve Elliott and Marcos Simon Galvez and covered activities on how sound travels and its speed; length and pitch in musical instruments; and reverberation and localisation. The final activity streamed live recordings from a dummy head to multiple sets of headphones for the students to listen to, demonstrating binaural sound localisation. Feedback from teachers was that the event had helped increase the students' knowledge of acoustics and improved their perception of science and engineering.
Year(s) Of Engagement Activity | 2016 |
Description | Workshop on Intelligent Music Production, Huddersfield |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation and demonstration of new media device orchestration content (The Vostok-K Incident) to workshop attendees, including academics, postgraduates and industry representatives. The workshop took place the day after the official BBC Taster launch of the content.
Year(s) Of Engagement Activity | 2018 |