Perception and Automated Assessment of Recorded Audio Quality, Especially User Generated Content

Lead Research Organisation: University of Salford

Department Name: Sch of Computing, Science & Engineering

Abstract

Many of us now carry around technologies which allow us to record sound, whether that is the sound of our child's first music concert on a digital camera or a recording of a practical joke on a mobile phone. Nowadays, there are many outlets for this user generated content. Last year alone, 13 million hours of video were uploaded to YouTube. Even professional broadcasters rely on this footage. Mainstream news bulletins regularly use amateur footage of dramatic events (e.g. Concorde crashing) and some TV programmes such as Rude Tube are entirely made up of user generated content.

However, the production quality of the sound on user-generated content is often very poor: distorted, noisy, with garbled speech or indistinct music. Our interest lies in the causes of the poor recording, especially what happens between the sound source and the electronic signal emerging from the microphone. Typical problems include: speaking off microphone; distorted speech due to clipping; wind noise and microphone handling noise. We are interested in audio recorded on its own, as well as soundtracks accompanying videos.

We want to improve the recording quality so that more user-generated audio can be widely used and re-used creatively. To do this, we will develop an understanding of how recording errors are perceived, because it is unclear how noise and distortion affect the perceived audio quality of many sounds. We will also develop algorithms for automatically evaluating audio quality from the poor recording itself.

A method for evaluating recorded audio quality has many potential uses. When media is received by a broadcast organisation, whether submitted by an amateur or a professional, a rapid quality assessment could determine whether the sound is of broadcast quality without time-consuming auditioning.

Searching for sounds on the Internet for creative re-use is a frustrating activity as it is difficult to find recordings, and those that are found are often of poor quality. An audio quality assessment method would make it possible to tag and search sound files for content and quality.

Even better, it would be possible to use the audio quality rating at the time of recording to try to improve the quality of the captured sound. A simple warning displayed on the recording device would give an opportunity to correct mistakes (for example, a warning light when someone is being recorded off-mic). Furthermore, a rating of audio quality could be used to produce devices which automatically correct common recording errors. The medium-term aim of this research is to develop such algorithms to correct common recording errors; however, a prerequisite is a method by which the quality of audio can be evaluated. That method is therefore the focus of this proposed project.

Planned Impact

Developing knowledge: The project will produce new knowledge and understanding about how sound quality is perceived and develop new techniques for characterising audio quality in blind signal processing. While the focus is on solving a specific problem, this knowledge should be of use to a broader community, for example understanding quality issues for live sound reproduction. Dissemination of this knowledge will be through academic routes (conferences, journals, British Library sound archivists), media (news stories, Internet) and industry (via BBC Audio Research Partnership).

Enhancing cultural enrichment: We want to improve the quality of recorded audio. Better quality audio opens up the possibility of more creative re-use, including encouraging members of the public to capture and appreciate sounds in a similar way to images being captured on cameras. Improving audio recording quality will come via our engagement with industry (see below).

Open source and standards: The databases and algorithms will be available as open source to allow them to be exploited in research and development. Following an open source philosophy also facilitates dissemination via an international standard based on the evaluation technique, which is one method to create impact by encouraging adoption of the technology. Openness will be at the heart of the project, with databases and results being made available publicly (where rights allow) and papers being published in journals that also allow open publication via the University of Salford's institutional repository.

Industry: BBC R&D and the British Library are partnering on the project and will help shape the work so the outcomes are relevant and useful to the broader media industry and sound archivists. The new BBC Audio Partnership will give a forum to allow pro-active dissemination within the BBC, to SMEs and other companies. Appropriate companies will be identified, and seminars and meetings will be arranged to discuss how the outcomes of the research can be exploited.

Public: Crowd sourcing forms part of the project during the more ecologically valid perceptual testing. Consequently, engaging the public is integral to the research project. Features, magazine programmes and news items will also be used to disseminate information about the project, drawing on the PI's expertise as a Senior Media Fellow.

Funded Value:

£456,986

Funded Period:

Apr 12 - Apr 15

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/J013013/1

Principal Investigator:

Trevor Cox

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Music & Acoustic Technology (100%)

Organisations

People
Trevor Cox (Principal Investigator)
Francis Li (Co-Investigator)
Bruno Fazenda (Co-Investigator)
Paul Kendrick (Researcher Co-Investigator)

Publications

Fazenda B (2016) Perception and automated assessment of audio quality in user generated content: An improved model

Jackson IR (2014) Perception and automatic detection of wind-induced microphone noise in The Journal of the Acoustical Society of America

Kendrick P (2015) Microphone Handling Noise: Measurements of Perceptual Threshold and Effects on Audio Quality in PLoS ONE

Kendrick P (2015) Perceived Audio Quality of Sounds Degraded by Nonlinear Distortions and Single-Ended Assessment Using HASQI in Journal of the Audio Engineering Society

Kendrick P (2016) The effect of microphone wind noise on the amplitude modulation of wind turbine noise and its mitigation in The Journal of the Acoustical Society of America

Kendrick P (2015) Using blind signal processing algorithms to remove wind noise from environmental noise assessments: A wind turbine amplitude modulation case study in The Journal of the Acoustical Society of America

Key Findings

Description	Sound recording is becoming ever more popular, most commonly as soundtracks on videos captured on mobile devices. Unfortunately, the quality of the audio is often poor. The project developed an understanding of how different recording errors affect the perceived quality of audio. It then developed a number of algorithms that detect degradation due to (i) wind noise, (ii) handling noise and (iii) distortion.
Exploitation Route	The algorithms have been made available to be exploited by the audio industry and also by makers of environmental noise measurement systems. We continue to try to find partners to exploit the work further.
Sectors	Digital/Communication/Information Technologies (including Software), Environment
URL	http://www.goodrecording.net/

Impact Summary

Description	The software is used by http://chirpomatic.com/ to filter training data for a bird song identification app.
First Year Of Impact	2016
Sector	Digital/Communication/Information Technologies (including Software)
Impact Types	Economic
