Audio Data Exploration: New Insights and Value
Lead Research Organisation:
Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science
Abstract
The "Audio Data Exploration: New Insights and Value" project is a collaboration between Audio Analytic Ltd. and the
Centre for Digital Music & Centre for Intelligent Sensing at Queen Mary University of London (QMUL). Compared to
mathematical, textual or visual data, audio data has remained largely underexploited and undervalued, and thus represents
an opportunity to grow innovation and to develop new markets. While Automatic Speech Recognition and Music Analysis
are now creating some industrial value, R&D needs to be conducted to tackle the challenges posed by Automatic
Environmental Sound Recognition defined in a broader sense. The newly developed advanced audio data analysis and
modelling techniques will create value across a variety of applicative domains. While proven markets include Professional
Security and Home Security, a range of novel markets can be developed in relation to Multimedia Database Indexing,
Environmental and Industrial Monitoring, the Internet of Things and more. The project will gather the newly developed
audio analysis and modelling techniques into a demonstrator instantiated as a "Personal Audio Space Indexer".
Planned Impact
ECONOMIC IMPACTS
Audio Analytic, the lead partner in this project, will directly benefit from the transfer of state-of-the-art academic research on
sound recognition into their product range. This will enable Audio Analytic to reinforce its position as a world leading
supplier of sound recognition applications. (Medium term time scale.) The company is planning to quadruple its size after
entering the consumer market, thus creating new employment in the UK (Medium term).
A wide range of players from a diversity of markets spanning professional security, home security, media indexing,
environmental/industrial monitoring and the internet of things will be able to offer a distinctive feature in the form of reliable
and smarter sound event detection as part of their products and applications. These impacts will come through both
existing and newly developed commercial relationships between Audio Analytic and such companies (Medium to long
term).
SOCIAL IMPACTS
Improved public safety: Members of the public looking to protect their safety and security will benefit from the deployment
of the new smart sound recognition as part of home security solutions and smart city solutions, to protect against events
such as, but not limited to: break-ins (detection of glass break sound, burglar alarm sound), aggressions (gunshots, calls
for help etc.), fire (detection of smoke alarm sound), domestic accidents (detection of SOS keywords, specific accident
sounds) etc. Companies and organisations looking to improve security and safety in the workplace, transport systems or
public spaces will benefit from the availability of solutions for sound-based safety and security monitoring.
Environment and public expense: In the case of accident detection, the faster response time enabled by the system may
mean shorter hospital stays and related reductions in NHS expense and CO2 emissions. In the case of environmental
monitoring, the proposed system will make it possible to compute indicators of environmental health through acoustic
monitoring.
Wellbeing of vulnerable adults: On a longer timescale, vulnerable adults in the home such as the elderly, and their families
or carers, will benefit from future enhancements to include sound monitoring for assistance calls to respond to SOS
keywords, or identification of departure from regular sound activity patterns in the home to provide reassurance and
support. Deaf members of the public may also see their quality of life improved by sound indicators that would describe
particular audio events arising in their environment. While some of these impacts are outside the scope of the current
project, the demonstrator will cover enough general sound recognition cases to enable the long term expansion of market
segments by Audio Analytic.
ADDITIONAL IMPACTS
People: The postdoctoral researcher (PDRA) will benefit through gaining skills in knowledge transfer and application of
research technology in a commercial setting, applicable to their future career. (Short term.)
Transfer of technology best practice: Academic research in the domain of sound recognition will benefit from access to a
real-life platform and real-life data to evaluate novel sound recognition algorithms. This project will set a precedent for best
practice in algorithm evaluation, maths optimisation and transfer of technology between academia and industry.
(Medium to long term.)
Publications
Sigtia S (2016) Automatic Environmental Sound Recognition: Performance Versus Computational Cost in IEEE/ACM Transactions on Audio, Speech, and Language Processing
Xu Y (2017) Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging in IEEE/ACM Transactions on Audio, Speech, and Language Processing
Description | Our experiments demonstrate that including temporal information in the input representation helps improve the performance of classifiers for environmental audio data. Autoencoders appear to be effective at summarising temporal information over short time-scales. We observed that spherical k-means, used as a feature learning technique, learns features that perform similarly to autoencoder features. This is a useful finding, since spherical k-means can be trained more efficiently than autoencoders on very large datasets. |
Exploitation Route | The findings are being investigated further by project partners Audio Analytic for use in their product range. |
Sectors | Digital/Communication/Information Technologies (including Software) |
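The two ideas named in the findings above, stacking neighbouring frames to give the classifier temporal context and learning features with spherical k-means, can be illustrated with a minimal sketch. This is not the project's implementation: the function names, the context width, the number of iterations and the rectified cosine-similarity encoding are all assumptions chosen for illustration, and preprocessing steps that are often used alongside spherical k-means (such as whitening) are omitted.

```python
import math
import random


def normalise(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def stack_context(frames, context=1):
    """Concatenate each frame with `context` neighbours on either side.

    Frames at the edges reuse the first or last frame as padding.
    `context` is a hypothetical parameter, not a value from the project.
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        window = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)
            window.extend(frames[idx])
        stacked.append(window)
    return stacked


def spherical_kmeans(data, k, iters=20, seed=0):
    """Cluster unit-normalised vectors by cosine similarity.

    Returns k unit-length centroids. An empty cluster keeps its previous centroid.
    """
    rng = random.Random(seed)
    points = [normalise(p) for p in data]
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        sums = [[0.0] * len(points[0]) for _ in range(k)]
        for p in points:
            # Assign each point to the centroid with the largest cosine similarity.
            best = max(range(k), key=lambda j: dot(p, centroids[j]))
            for i, x in enumerate(p):
                sums[best][i] += x
        for j in range(k):
            if any(sums[j]):
                # Project the summed members back onto the unit sphere.
                centroids[j] = normalise(sums[j])
    return centroids


def encode(frame, centroids):
    """Map a frame to features: rectified cosine similarity to each centroid (an assumed encoding)."""
    p = normalise(frame)
    return [max(dot(p, c), 0.0) for c in centroids]


# Toy usage with made-up 2-dimensional "spectral" frames.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
stacked = stack_context(frames, context=1)      # each row now has 3 * 2 = 6 values
centroids = spherical_kmeans(stacked, k=2, iters=5)
features = [encode(row, centroids) for row in stacked]
```

Each training iteration here needs only similarity computations and sums, with no gradient-based optimisation, which is one plausible reason such a method scales more readily to very large datasets than autoencoder training.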
Description | The findings were investigated by project partners Audio Analytic for use in their product range. |
First Year Of Impact | 2016 |
Sector | Digital/Communication/Information Technologies (including Software),Security and Diplomacy |
Impact Types | Economic |
Description | Audio-Visual Media Research Platform |
Amount | £1,577,223 (GBP) |
Funding ID | EP/P022529/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 08/2017 |
End | 07/2022 |
Description | H2020-ICT-2015 Audio Commons |
Amount | € 2,980,000 (EUR) |
Funding ID | 688382 |
Organisation | European Commission |
Sector | Public |
Country | European Union (EU) |
Start | 02/2016 |
End | 01/2019 |
Title | CHiME-Home |
Description | The CHiME-Home dataset is a collection of annotated domestic environment audio recordings, described in: P. Foster, S. Sigtia, S. Krstulovic, J. Barker, M. D. Plumbley. "CHiME-Home: A Dataset for Sound Source Recognition in a Domestic Environment," in Proceedings of the 11th Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2015; and available from https://archive.org/details/chime-home |
Type Of Material | Database/Collection of data |
Year Produced | 2015 |
Provided To Others? | Yes |
Impact | The dataset is being used in task 4 of the DCASE 2016 Challenge, for performance evaluation of systems for the detection and classification of sound events. The challenge is organised by the Audio Research Group of Tampere University of Technology, the QMUL Centre for Digital Music, IRCCYN and the University of Surrey, under the auspices of the Audio and Acoustic Signal Processing (AASP) technical committee of the IEEE Signal Processing Society. See http://www.cs.tut.fi/sgn/arg/dcase2016/task-audio-tagging |
URL | https://archive.org/details/chime-home |