Audio Data Exploration: New Insights and Value

Lead Research Organisation: Queen Mary University of London

Department Name: School of Electronic Engineering and Computer Science

Abstract

The "Audio Data Exploration: New Insights and Value" project is a collaboration between Audio Analytic Ltd. and the
Centre for Digital Music & Centre for Intelligent Sensing at Queen Mary University of London (QMUL). Compared to
mathematical, textual or visual data, audio data has remained largely underexploited and undervalued, and thus represents
an opportunity to grow innovation and to develop new markets. While Automatic Speech Recognition and Music Analysis
are now creating some industrial value, R&D needs to be conducted to tackle the challenges posed by Automatic
Environmental Sound Recognition defined in a broader sense. The newly developed advanced audio data analysis and
modelling techniques will create value across a variety of application domains. While proven markets include Professional
Security and Home Security, a range of novel markets can be developed in relation to Multimedia Database Indexing,
Environmental and Industrial Monitoring, the Internet of Things and more. The project will gather the newly developed
audio analysis and modelling techniques into a demonstrator instantiated as a "Personal Audio Space Indexer".

Planned Impact

ECONOMIC IMPACTS
Audio Analytic, the lead partner in this project, will directly benefit from the transfer of state-of-the-art academic research on
sound recognition into its product range. This will enable Audio Analytic to reinforce its position as a world-leading
supplier of sound recognition applications. (Medium term time scale.) The company is planning to quadruple its size after
entering the consumer market, thus creating new employment in the UK (Medium term).
A wide range of players from a diversity of markets spanning professional security, home security, media indexing,
environmental/industrial monitoring and the internet of things will be able to offer a distinctive feature in the form of reliable
and smarter sound event detection as part of their products and applications. These impacts will come through both
existing and newly developed commercial relationships between Audio Analytic and such companies (Medium to long
term).
SOCIAL IMPACTS
Improved public safety: Members of the public looking to protect their safety and security will benefit from the deployment
of the new smart sound recognition as part of home security solutions and smart city solutions, to protect against events
such as, but not limited to: break-ins (detection of glass break sound, burglar alarm sound), aggressions (gunshots, calls
for help etc.), fire (detection of smoke alarm sound), domestic accidents (detection of SOS keywords, specific accident
sounds) etc. Companies and organisations looking to improve security and safety in the workplace, transport systems or
public spaces will benefit from the availability of solutions for sound-based safety and security monitoring.
Environment and public expense: In the case of accident detection, the faster response time enabled by the system may
mean shorter hospital stays and related reductions in NHS expense and CO2 emissions. In the case of environmental
monitoring, the proposed system will make it possible to compute indicators of environmental health through acoustic
monitoring.
Wellbeing of vulnerable adults: On a longer timescale, vulnerable adults in the home such as the elderly, and their families
or carers, will benefit from future enhancements such as sound monitoring that responds to SOS keywords in calls for
assistance, or that identifies departures from regular sound activity patterns in the home, to provide reassurance and
support. Deaf members of the public may also see their quality of life improved by sound indicators that describe
particular audio events arising in their environment. While some of these impacts are outside the scope of the current
project, the demonstrator will cover enough general sound recognition cases to enable the long term expansion of market
segments by Audio Analytic.
ADDITIONAL IMPACTS
People: The postdoctoral researcher (PDRA) will benefit through gaining skills in knowledge transfer and application of
research technology in a commercial setting, applicable to their future career. (Short term.)
Transfer of technology best practice: Academic research in the domain of sound recognition will benefit from access to a
real-life platform and real-life data to evaluate novel sound recognition algorithms. This project will set a precedent for best
practice in algorithm evaluation, maths optimisation and transfer of technology between academia and industry.
(Medium to long term.)

Funded Value:

£77,442

Funded Period:

Nov 14 - Sep 15

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/M507088/1

Principal Investigator:

Simon Dixon

Mark Plumbley

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Artificial Intelligence (40%)

Music & Acoustic Technology (60%)

Organisations

Queen Mary University of London (Lead Research Organisation)

People	ORCID iD
Simon Dixon (Principal Investigator)	http://orcid.org/0000-0002-6098-481X
Mark Plumbley (Principal Investigator)

Publications

Benetos E (2016) Detection of overlapping acoustic events using a temporally-constrained probabilistic model

Foster P (2015) Chime-home: A dataset for sound source recognition in a domestic environment

Sigtia S (2016) Automatic Environmental Sound Recognition: Performance Versus Computational Cost in IEEE/ACM Transactions on Audio, Speech, and Language Processing

Sigtia S (2016) Automatic Environmental Sound Recognition: Performance versus Computational Cost

Xu Y (2017) Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging in IEEE/ACM Transactions on Audio, Speech, and Language Processing

Key Findings

Description	Our experiments demonstrate that including temporal information in the input representation helps improve the performance of classifiers for environmental audio data. Autoencoders appear to be effective at summarising temporal information over short time-scales. We observed that spherical k-means, as a feature learning technique, is able to learn features that perform similarly to autoencoder features. This is a useful finding, since spherical k-means can be trained more efficiently on very large datasets than autoencoders.
Exploitation Route	The findings are being investigated further by project partners Audio Analytic for use in their product range.
Sectors	Digital/Communication/Information Technologies (including Software)
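
The two ideas in the findings above, adding temporal context to the input and learning features with spherical k-means, can be illustrated with a minimal sketch. This is not the project's implementation: the function names (stack_context, spherical_kmeans, encode), the edge-padding rule, the random initialisation and the rectified-similarity encoder are all assumptions made for illustration, and only the Python standard library is used to keep the example self-contained. A real system would more likely use a numerical library and spectral features computed from audio.

```python
import math
import random


def stack_context(frames, context=1):
    """Concatenate each frame with `context` neighbours on either side.

    Edge frames are padded by repeating the first or last frame
    (an assumption; other padding schemes are possible).
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        window = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp at the edges
            window.extend(frames[idx])
        stacked.append(window)
    return stacked


def unit_norm(vec):
    """Scale a vector to unit Euclidean length (zero vectors are left as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(data, k, iterations=20, seed=0):
    """Cluster unit-normalised vectors by cosine similarity.

    Returns (centroids, assignments); every centroid has unit length.
    """
    rng = random.Random(seed)
    points = [unit_norm(v) for v in data]
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: pick the centroid with the largest dot product.
        assignments = [
            max(range(k), key=lambda j: dot(p, centroids[j])) for p in points
        ]
        # Update step: sum the members of each cluster, then renormalise.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                summed = [sum(col) for col in zip(*members)]
                centroids[j] = unit_norm(summed)
    return centroids, assignments


def encode(vec, centroids):
    """Hypothetical encoder: rectified cosine similarity to each centroid."""
    v = unit_norm(vec)
    return [max(0.0, dot(v, c)) for c in centroids]
```

In use, a sequence of per-frame feature vectors would first be passed through stack_context, the stacked vectors clustered with spherical_kmeans, and each stacked vector then mapped with encode to a k-dimensional feature for a downstream classifier. Because the update step only needs sums and one renormalisation per cluster, each iteration is cheap, which is consistent with the efficiency argument in the findings.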


Impact Summary

Description	The findings were investigated by project partners Audio Analytic for use in their product range.
First Year Of Impact	2016
Sector	Digital/Communication/Information Technologies (including Software), Security and Diplomacy
Impact Types	Economic


Further Funding

Description	Audio-Visual Media Research Platform
Amount	£1,577,223 (GBP)
Funding ID	EP/P022529/1
Organisation	Engineering and Physical Sciences Research Council (EPSRC)
Sector	Public
Country	United Kingdom
Start	08/2017
End	07/2022


Description	H2020-ICT-2015 Audio Commons
Amount	€ 2,980,000 (EUR)
Funding ID	688382
Organisation	European Commission
Sector	Public
Country	European Union (EU)
Start	02/2016
End	01/2019


Research Databases and Models

Title	CHiME-Home
Description	The CHiME-Home dataset is a collection of annotated domestic environment audio recordings, described in: P. Foster, S. Sigtia, S. Krstulovic, J. Barker, M. D. Plumbley. "CHiME-Home: A Dataset for Sound Source Recognition in a Domestic Environment," in Proceedings of the 11th Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2015; and available from https://archive.org/details/chime-home
Type Of Material	Database/Collection of data
Year Produced	2015
Provided To Others?	Yes
Impact	The dataset is being used in task 4 of the DCASE2016 Challenge, for performance evaluation of systems for the detection and classification of sound events. The challenge is organised by the Audio Research Group of Tampere University of Technology, the QMUL Centre for Digital Music, IRCCYN and the University of Surrey, under the auspices of the Audio and Acoustic Signal Processing (AASP) technical committee of the IEEE Signal Processing Society. See http://www.cs.tut.fi/sgn/arg/dcase2016/task-audio-tagging
URL	https://archive.org/details/chime-home
