Perceptual Sound Field Reconstruction and Coherent Emulation

Lead Research Organisation: University of Surrey
Department Name: Sound Recording

Abstract

The project is concerned with the development of a new 5 to 10 channel audio technology which would improve on existing ones in terms of (a) realism, (b) accuracy and stability of the auditory perspective, (c) size of the sweet spot, and (d) the envelopment experience. Since the new technology aims to create a 360-degree auditory perspective, reproduction will take place over speakers positioned at the vertices of a regular polygon. Each speaker will consist of two components: one which radiates the direct sound field toward the listener, and another which reproduces the diffuse sound field by introducing additional scattering. The goal of the particular tasks listed below is to find optimal ways to capture sound field cues and render them using the proposed playback system in a manner which provides the most convincing illusion of the original or desired sound field.
(i) Optimal microphone arrays for the proposed playback system will be investigated. The arrays considered will consist of microphones placed in the horizontal plane at the vertices of a regular polygon, with the number of microphones equal to the number of speakers. For each array, different diameters, in the range from near coincident up to somewhat beyond the optimal value, and different microphone directivity patterns will be considered. These studies will be repeated for a few diameters of the speaker configuration to investigate whether the optimal array diameter depends on the size of the speaker layout and, if so, to characterise that dependence. Possible dependencies between the optimal microphone directivity patterns and array diameters will also be investigated and characterised. Arrays will be evaluated in critical listening tests according to criteria (a) to (d) stated above. Experiments will be guided by simulations which provide an initial objective assessment of the ITD and ILD cues generated within the listening area.
In parallel, mathematical models of sound fields generated by the proposed technology will be investigated, which could provide additional insight into optimal microphone array design.
(ii) The impact of playback with cross-talk cancellation will be systematically investigated. Existing cross-talk cancellation algorithms will be used first and, if necessary, new algorithms which are numerically efficient and effective in a range of listening environments will be developed. Then optimal microphone arrays for playback with cross-talk cancellation will be investigated, i.e. the work described under (i) will be repeated for reproduction with cross-talk cancellation. Finally, optimal systems with and without cross-talk cancellation will be compared.
(iii) Algorithms for direct/diffuse sound field separation will be studied. When the number of instruments does not exceed the number of microphones, multichannel equalization techniques can be used to find dry source signals, which can then be convolved with the direct and reverberant parts of room impulse responses to obtain the direct and diffuse sound field components, respectively. Multichannel equalization in audio is, however, particularly challenging owing to excessively long impulse responses, and we will develop numerically efficient algorithms for multichannel equalization for audio applications. We will then study psychoacoustic approximations to direct/diffuse sound field decomposition with no restriction on the number of sources.
(iv) Combinations of near-coincident directional microphone arrays, for acquiring direct sound field cues, and widely spaced arrays based on omni-directional or bi-directional microphones, for acquiring diffuse sound field cues, will be systematically investigated in critical listening tests according to criteria (a) to (d). This approach will be evaluated in comparison with the approach described in (i) to (iii), where the same array is used for both sound field components.
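The convolution step described under task (iii) can be illustrated with a minimal sketch. It assumes a dry source signal and a single room impulse response are already available, and it splits the impulse response at a fixed time window after its strongest peak. The function names, the 5 ms default window, and the peak-based boundary are illustrative assumptions, not the project's method.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def split_rir(rir, sample_rate, direct_window_ms=5.0):
    """Split a room impulse response (a list) into direct and reverberant parts.

    Assumption: the direct path is the strongest peak, and everything up to
    direct_window_ms after it counts as direct sound. Both returned parts
    keep the original length and time alignment.
    """
    peak = max(range(len(rir)), key=lambda n: abs(rir[n]))
    window = int(round(direct_window_ms * 1e-3 * sample_rate))
    boundary = min(len(rir), peak + window + 1)
    direct = rir[:boundary] + [0.0] * (len(rir) - boundary)
    reverberant = [0.0] * boundary + rir[boundary:]
    return direct, reverberant


def direct_diffuse_components(dry, rir, sample_rate, direct_window_ms=5.0):
    """Convolve a dry signal with each part of the impulse response."""
    direct, reverberant = split_rir(rir, sample_rate, direct_window_ms)
    return convolve(dry, direct), convolve(dry, reverberant)
```

Because the two parts of the impulse response sum to the original, the two output components sum to the fully reverberant signal. Treating the late part as the diffuse field is itself an approximation.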

Publications

Description The project was concerned with the development of a new 5 to 10 channel audio technology which would improve over existing ones in terms of (a) realism, (b) accuracy and stability of the auditory perspective, (c) size of the sweet spot, and (d) the envelopment experience.
Based on the principal horizontal localisation cues of interaural time and level differences, an optimal loudspeaker arrangement of 8 loudspeakers (each at the vertex of a regular octagon) was derived. This allows optimum localisation over the whole 360 degrees of the horizontal plane around the listener whilst still using a practical number of loudspeakers. It also allows for reasonable robustness to errors caused by listener movement. A subjective experiment was undertaken which demonstrated the increased accuracy in localisation afforded by this system in comparison to a conventional 5.1 surround sound system.
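As an illustration of the octagonal layout, the following sketch computes loudspeaker azimuths and positions for a regular polygon, and finds the adjacent loudspeaker pair that brackets a target direction. The orientation (first loudspeaker at 0 degrees), the counter-clockwise azimuth convention, and the function names are assumptions made for the example; the record does not specify them.

```python
import math


def polygon_azimuths(n):
    """Azimuths in degrees of the n vertices of a regular polygon.

    Assumption: the first loudspeaker sits at 0 degrees and azimuth
    increases counter-clockwise.
    """
    return [i * 360.0 / n for i in range(n)]


def polygon_positions(n, radius):
    """(x, y) coordinates in metres of each vertex, listener at the origin."""
    return [
        (radius * math.cos(math.radians(a)), radius * math.sin(math.radians(a)))
        for a in polygon_azimuths(n)
    ]


def bracketing_pair(azimuth_deg, n):
    """Indices of the two adjacent loudspeakers on either side of a direction."""
    step = 360.0 / n
    lower = int((azimuth_deg % 360.0) // step)
    return lower, (lower + 1) % n
```

For eight loudspeakers the spacing is 45 degrees, so any source direction lies at most 22.5 degrees from the nearest loudspeaker.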
A series of controlled subjective experiments was undertaken to determine the localisation curves for this loudspeaker layout. This involved reproducing a range of signals with a range of interchannel level and time differences to listeners, and asking them to judge the perceived location of each stimulus. The results of these experiments took the form of localisation maps, which allow the perceived location of a stimulus to be predicted from the interchannel level and time differences for a wide range of expected values.
The localisation maps were then used to develop optimal microphone arrays. The arrays considered consisted of microphones placed in the horizontal plane at the vertices of a regular polygon, with the number of microphones equal to the number of speakers. For each array, different diameters, in the range from near coincident up to somewhat beyond the optimal value, and different first-order microphone directivity patterns were considered.
A further subjective experiment was conducted to evaluate the accuracy of the predictions of the localisation performance of each microphone array. It was found that for the arrays which resulted in lower levels of cross-talk the predictions were accurate (within the confidence intervals of the subjective judgements). For the arrays which resulted in higher levels of cross-talk the predictions were less accurate, but these arrays also suffered from poorer localisation performance, and poorer robustness to errors caused by listener movement.
Overall, it was found that the 8-channel loudspeaker array, with the loudspeakers at the vertices of a regular octagon, was the optimum practical loudspeaker arrangement for accurate and homogeneous localisation in the horizontal plane. The optimum microphone array for this layout was found to consist of cardioid microphones, each at a vertex of a regular octagon with a diameter of 2.5 m. The results indicated that the combination of this microphone array and loudspeaker array gave relatively accurate localisation cues around the full 360 degrees of the horizontal plane, and also good performance in terms of source width, spaciousness, and envelopment.
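To give a sense of the interchannel cues such an array produces, the sketch below computes the level and time differences between two microphones of a circular cardioid array for a far-field source. It assumes ideal cardioids pointing radially outward, a plane-wave source, and a speed of sound of 343 m/s. None of these modelling choices are stated in the record, and the function names are illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def array_response(source_az_deg, n_mics=8, diameter=2.5):
    """Per-microphone (gain, delay in seconds) for a far-field plane wave.

    Assumptions: microphone i sits at azimuth i * 360 / n_mics on a circle
    of the given diameter, and each ideal cardioid points radially outward.
    Delays are relative to the array centre.
    """
    radius = diameter / 2.0
    src = math.radians(source_az_deg)
    response = []
    for i in range(n_mics):
        off_axis = src - 2.0 * math.pi * i / n_mics
        gain = 0.5 * (1.0 + math.cos(off_axis))  # ideal cardioid pattern
        # Microphones displaced toward the source receive the wave earlier.
        delay = -radius * math.cos(off_axis) / SPEED_OF_SOUND
        response.append((gain, delay))
    return response


def interchannel_differences(source_az_deg, i, j, n_mics=8, diameter=2.5):
    """Level difference (dB) and time difference (ms) of channel i versus j.

    Positive values mean channel i is louder, or earlier, than channel j.
    """
    resp = array_response(source_az_deg, n_mics, diameter)
    (gain_i, delay_i), (gain_j, delay_j) = resp[i], resp[j]
    floor = 1e-12  # avoids log of zero at a cardioid null
    level_db = 20.0 * math.log10(max(gain_i, floor) / max(gain_j, floor))
    time_ms = (delay_j - delay_i) * 1000.0
    return level_db, time_ms
```

Under these assumptions, a source midway between two adjacent microphones yields zero level and time difference between them, while a source on the axis of one microphone favours that channel in both level and arrival time.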
Exploitation Route The project has resulted in useful outcomes in three distinct areas.

Firstly, the loudspeaker layout that was developed gives homogeneous performance around the horizontal plane, with relatively good localisation for a range of listening positions. This provides a significant improvement compared to conventional 5.1 surround sound.

Secondly, a method to optimise microphone arrays for a given loudspeaker layout was developed based on attempting to recreate realistic perceptual cues at the listening position. This has resulted in new microphone array possibilities, and demonstrated the limitations of conventional coincident techniques.

Finally, the results from the listening tests can be used in other areas of psychoacoustic research, for instance as a data set for the development of psychoacoustic models of localisation.
Sectors Creative Economy, Digital/Communication/Information Technologies (including Software), Electronics

URL http://www.surrey.ac.uk/msr/people/laurent_simon/index.htm
 
Description The results of the research have been used as follows. Firstly, the research has contributed to the development of new surround sound systems beyond 5.1. New formats including additional loudspeakers, both positioned on the horizontal plane and elevated above it, have made use of the research undertaken in this project. Secondly, the development of the microphone techniques has demonstrated the limitations of coincident first-order arrays, and has led to the development of both spaced arrays and higher-order microphone configurations. Finally, the listening test data have been used to assist in the development of psychoacoustic hearing models to predict perceived localisation.
First Year Of Impact 2010
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic