Audio Data Exploration: New Insights and Value

Lead Research Organisation: Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science

Abstract

The "Audio Data Exploration: New Insights and Value" project is a collaboration between Audio Analytic Ltd. and the
Centre for Digital Music & Centre for Intelligent Sensing at Queen Mary University of London (QMUL). Compared to
mathematical, textual or visual data, audio data has remained largely underexploited and undervalued, and thus represents
an opportunity to grow innovation and to develop new markets. While Automatic Speech Recognition and Music Analysis
are now creating some industrial value, R&D needs to be conducted to tackle the challenges posed by Automatic
Environmental Sound Recognition, defined in a broad sense. The newly developed audio data analysis and
modelling techniques will create value across a variety of application domains. While proven markets include Professional
Security and Home Security, a range of novel markets can be developed in relation to Multimedia Database Indexing,
Environmental and Industrial Monitoring, the Internet of Things and more. The project will gather the newly developed
audio analysis and modelling techniques into a demonstrator instantiated as a "Personal Audio Space Indexer".

Planned Impact

ECONOMIC IMPACTS
Audio Analytic, the lead partner in this project, will directly benefit from the transfer of state-of-the-art academic research on
sound recognition into their product range. This will enable Audio Analytic to reinforce its position as a world leading
supplier of sound recognition applications. (Medium term.) The company is planning to quadruple its size after
entering the consumer market, thus creating new employment in the UK (Medium term).
A wide range of players from a diversity of markets spanning professional security, home security, media indexing,
environmental/industrial monitoring and the internet of things will be able to offer a distinctive feature in the form of reliable
and smarter sound event detection as part of their products and applications. These impacts will come through both
existing and newly developed commercial relationships between Audio Analytic and such companies (Medium to long
term).
SOCIAL IMPACTS
Improved public safety: Members of the public looking to protect their safety and security will benefit from the deployment
of the new smart sound recognition as part of home security solutions and smart city solutions, to protect against events
such as, but not limited to: break-ins (detection of glass break sound, burglar alarm sound), assaults (gunshots, calls
for help etc.), fire (detection of smoke alarm sound), domestic accidents (detection of SOS keywords, specific accident
sounds) etc. Companies and organisations looking to improve security and safety in the workplace, transport systems or
public spaces will benefit from the availability of solutions for sound-based safety and security monitoring.
Environment and public expense: In the case of accident detection, the faster response time enabled by the system may
mean shorter hospital stays and related reductions in NHS expense and CO2 emissions. In the case of environmental
monitoring, the proposed system will make it possible to compute indicators of environmental health through acoustic
monitoring.
Wellbeing of vulnerable adults: On a longer timescale, vulnerable adults in the home such as the elderly, and their families
or carers, will benefit from future enhancements to include sound monitoring for assistance calls to respond to SOS
keywords, or identification of departure from regular sound activity patterns in the home to provide reassurance and
support. Deaf members of the public may also see their quality of life improved by sound indicators that would describe
particular audio events arising in their environment. While some of these impacts are outside the scope of the current
project, the demonstrator will cover enough general sound recognition cases to enable the long term expansion of market
segments by Audio Analytic.
ADDITIONAL IMPACTS
People: The postdoctoral researcher (PDRA) will benefit through gaining skills in knowledge transfer and application of
research technology in a commercial setting, applicable to their future career. (Short term.)
Transfer of technology best practice: Academic research in the domain of sound recognition will benefit from access to a
real-life platform and real-life data to evaluate novel sound recognition algorithms. This project will set a precedent for best
practice in algorithm evaluation, maths optimisation and transfer of technology between academia and industry.
(Medium to long term.)

Publications

 
Description Our experiments demonstrate that including temporal information in the input representation helps improve the performance of classifiers for environmental audio data. Autoencoders appear to be effective at summarising temporal information over short time-scales. We observed that spherical k-means, as a feature learning technique, is able to learn features that perform similarly to autoencoder features. This is a useful finding, since spherical k-means can be trained more efficiently on very large datasets than autoencoders.
Exploitation Route The findings are being investigated further by project partners Audio Analytic for use in their product range.
Sectors Digital/Communication/Information Technologies (including Software)
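To illustrate the finding above, here is a minimal sketch of spherical k-means feature learning: data points are projected onto the unit sphere, clustered by cosine similarity, and the learned centroids then act as a dictionary for encoding new frames. This is not the project's implementation; the function names, the random "frames", and all parameter values are illustrative.

```python
import numpy as np

def spherical_kmeans(X, k, n_iter=50, seed=0):
    """Spherical k-means: cluster unit-normalised vectors by cosine similarity."""
    rng = np.random.default_rng(seed)
    # Project the data onto the unit sphere.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Initialise centroids from randomly chosen data points.
    C = Xn[rng.choice(len(Xn), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to the centroid with the highest cosine similarity.
        labels = (Xn @ C.T).argmax(axis=1)
        for j in range(k):
            members = Xn[labels == j]
            if len(members):
                m = members.sum(axis=0)
                C[j] = m / np.linalg.norm(m)  # re-normalise the mean direction
    return C

def encode(X, C):
    """Feature extraction: cosine similarity of each frame to each centroid."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return Xn @ C.T

# Example: learn a 4-atom dictionary from 200 random 16-dimensional "frames".
rng = np.random.default_rng(1)
frames = rng.normal(size=(200, 16))
C = spherical_kmeans(frames, k=4)
features = encode(frames, C)  # shape (200, 4): one feature per centroid
```

Because assignment and encoding reduce to a single matrix product against unit-norm centroids, each iteration is cheap, which reflects the efficiency advantage over autoencoder training noted in the findings.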

 
Description The findings were investigated by project partners Audio Analytic for use in their product range.
First Year Of Impact 2016
Sector Digital/Communication/Information Technologies (including Software), Security and Diplomacy
Impact Types Economic

 
Description Audio-Visual Media Research Platform
Amount £1,577,223 (GBP)
Funding ID EP/P022529/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 08/2017 
End 07/2022
 
Description H2020-ICT-2015 Audio Commons
Amount € 2,980,000 (EUR)
Funding ID 688382 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 02/2016 
End 01/2019
 
Title CHiME-Home 
Description The CHiME-Home dataset is a collection of annotated domestic environment audio recordings, described in: P. Foster, S. Sigtia, S. Krstulovic, J. Barker, M. D. Plumbley. "CHiME-Home: A Dataset for Sound Source Recognition in a Domestic Environment," in Proceedings of the 11th Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2015; and available from https://archive.org/details/chime-home 
Type Of Material Database/Collection of data 
Year Produced 2015 
Provided To Others? Yes  
Impact The dataset is being used in task 4 of the DCASE2016 Challenge, for performance evaluation of systems for the detection and classification of sound events. The challenge is organised by the Audio Research Group of Tampere University of Technology, the QMUL Centre for Digital Music, IRCCYN, and the University of Surrey, under the auspices of the Audio and Acoustic Signal Processing (AASP) technical committee of the IEEE Signal Processing Society. See http://www.cs.tut.fi/sgn/arg/dcase2016/task-audio-tagging 
URL https://archive.org/details/chime-home