Perception and Automated Assessment of Recorded Audio Quality, Especially User Generated Content

Lead Research Organisation: University of Salford
Department Name: Sch of Computing, Science & Engineering

Abstract

Many of us now carry around technologies which allow us to record sound, whether that is the sound of our child's first music concert on a digital camera or a recording of a practical joke on a mobile phone. Nowadays, there are many outlets for this user generated content. Last year alone, 13 million hours of video were uploaded to YouTube. Even professional broadcasters rely on this footage. Mainstream news bulletins regularly use amateur footage of dramatic events (e.g. Concorde crashing) and some TV programmes such as Rude Tube are entirely made up of user generated content.

However, the production quality of the sound on user-generated content is often very poor: distorted, noisy, with garbled speech or indistinct music. Our interest lies in the causes of the poor recording, especially what happens between the sound source and the electronic signal emerging from the microphone. Typical problems include speaking off microphone, distorted speech due to clipping, wind noise, and microphone handling noise. We are interested in audio recorded on its own, as well as soundtracks accompanying videos.

We want to improve the recording quality so that more user-generated audio can be widely used and re-used creatively. To do this we will develop an understanding of how recording errors are perceived, as it is unclear how noise and distortion affect the perception of audio quality for many sounds. We will develop algorithms for automatically evaluating audio quality from the poor recording.

A method for evaluating recorded audio quality has many potential uses. When media is received by a broadcast organisation, whether submitted by an amateur or professional, a rapid quality assessment could determine whether the sound is of broadcast quality without time-consuming auditioning.

Searching for sounds on the Internet for creative re-use is a frustrating activity as it is difficult to find recordings, and those that are found are often of poor quality. An audio quality assessment method would make it possible to tag and search sound files for content and quality.

Even better, it would be possible to use the audio quality rating at the time of recording to try to improve the quality of the captured sound. A simple warning displayed on the recording device would give an opportunity to correct mistakes (for example, a warning light when someone is being recorded off-mic). Furthermore, a rating of audio quality could be used to produce devices which automatically correct common recording errors. The medium term aim of this research is to develop such algorithms to correct common recording errors. However, a pre-requisite is a method by which the quality of audio can be evaluated, and so that is the focus of this proposed project.

Planned Impact

Developing knowledge: The project will produce new knowledge and understanding about how sound quality is perceived and develop new techniques for characterising audio quality in blind signal processing. While the focus is on solving a specific problem, this knowledge should be of use to a broader community, for example understanding quality issues for live sound reproduction. Dissemination of this knowledge will be through academic routes (conferences, journals, British Library sound archivists), media (news stories, Internet) and industry (via BBC Audio Research Partnership).

Enhancing cultural enrichment: We want to improve the quality of recorded audio. Better quality audio opens up the possibility of more creative re-use, including encouraging members of the public to capture and appreciate sounds in a similar way to images being captured on cameras. Improving audio recording quality will come via our engagement with industry (see below).

Open source and standards: The databases and algorithms will be available open source to allow them to be exploited in research and development. Following an open source philosophy also facilitates dissemination via an international standard based on the evaluation technique, which is one method to create impact by encouraging adoption of the technology. Openness will be at the heart of the project, with databases and results being made available publicly (where rights allow) and papers being published in journals that also allow open publication via the University of Salford's institutional repository.

Industry: BBC R&D and the British Library are partnering on the project and will help shape the work so the outcomes are relevant and useful to the broader media industry and sound archivists. The new BBC Audio Partnership will give a forum to allow pro-active dissemination within the BBC, to SMEs and other companies. Appropriate companies will be identified and seminars and meetings arranged to discuss how the outcomes of the research can be exploited.

Public: Crowd sourcing forms part of the project during the ecologically more valid perceptual testing. Consequently, engaging the public is integral to the research project. Features, magazine programmes and news items will also be used to disseminate information about the project, drawing on the PI's expertise as a Senior Media Fellow.
 
Description Sound recording is becoming ever more popular, most commonly as soundtracks on videos captured on mobile devices. Unfortunately, the quality of the audio is often poor. The project developed an understanding of how different recording errors affect the perceived quality of audio. It then developed a number of algorithms that detect (i) wind noise, (ii) handling noise and (iii) distortion.
Exploitation Route The algorithms have been made available to be exploited by the audio industry and also by makers of environmental noise measurement systems. We continue to try to find partners to exploit the work further.
Sectors Digital/Communication/Information Technologies (including Software),Environment

URL http://www.goodrecording.net/
 
Description Software used by http://chirpomatic.com/ to filter training data for a bird song identification app.
First Year Of Impact 2016
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic