
ActivATOR - Active AudiTiOn for Robots

Lead Research Organisation: University of Southampton
Department Name: Sch of Electronics and Computer Sci

Abstract

Life in sound occurs in motion. For human listeners, audition - the ability to listen - is shaped by physical interactions between our bodies and the environment. We integrate motion with auditory perception in order to hear better (e.g., by approaching sound sources of interest), to identify objects (e.g., by touching objects and listening to the resulting sound), to detect faults (e.g., by moving objects to listen for anomalous creaks), and to offload thought (e.g., by tapping surfaces to recall musical pieces).

Therefore, the ability to make sense of and exploit sounds in motion is a fundamental prerequisite for embodied Artificial Intelligence (AI). This project will pioneer the underpinning probabilistic framework for active robot audition that enables embodied agents to control the motion of their own bodies ('ego-motion') for auditory attention in realistic acoustic environments (households, public spaces, and environments involving multiple, competing sound sources).
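
To make the idea concrete, the following minimal Python sketch illustrates one possible flavour of active audition (it is not the project's actual framework): a particle filter maintains a probabilistic belief over a sound source's direction of arrival, and the robot's next rotation is chosen so the array faces the estimated source. The particle count, noise model, and motion policy are assumptions for illustration only.

    # Illustrative sketch only: Bayesian tracking of a source direction,
    # with ego-motion chosen from the current belief.
    import numpy as np

    rng = np.random.default_rng(0)
    N = 500                                    # number of particles
    particles = rng.uniform(-np.pi, np.pi, N)  # hypothesised source DoAs (rad)
    weights = np.ones(N) / N

    def update(measured_doa, noise_std=0.2):
        """Bayesian update: re-weight particles by how well they explain
        a noisy direction-of-arrival measurement, then resample."""
        global particles, weights
        err = np.angle(np.exp(1j * (particles - measured_doa)))  # wrapped error
        weights *= np.exp(-0.5 * (err / noise_std) ** 2)
        weights /= weights.sum()
        idx = rng.choice(N, N, p=weights)                    # resample
        particles = particles[idx] + rng.normal(0, 0.05, N)  # jitter
        weights[:] = 1.0 / N

    def next_ego_motion():
        """Heuristic ego-motion policy: rotate so the microphone array
        faces the circular-mean estimate of the source direction."""
        return np.angle(np.mean(np.exp(1j * particles)))

    for step in range(10):
        z = 0.8 + rng.normal(0, 0.2)  # simulated noisy DoA measurement (rad)
        update(z)
        print(f"step {step}: rotate by {np.degrees(next_ego_motion()):.1f} deg")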

By integrating sound with motion, this project will enable machines to imagine, control and leverage the auditory consequences of physical interactions with the environment. By transforming the ways in which machines make sense of life in sound, the research outcomes will be pivotal for emerging markets in which robots augment, rather than rival, humans in order to surpass the limitations of the human body (sensory accuracy, strength, endurance, memory). The proposed research therefore has the potential to transform and disrupt a whole host of industries involving machine listening, ranging from human-robot augmentation (smart prosthetics, assistive listening technology, brain-computer interfaces) to human-robot collaboration (planetary exploration, search-and-rescue, hazardous material removal) and automation (environmental monitoring, autonomous vehicles, AI-assisted diagnosis in healthcare).

This project will consider the specific case study of a collaborative robot ('cobot') that augments the auditory experience of a hearing-impaired human partner. Hearing loss is the second most common disability in the UK, affecting 11M people. The loss of hearing affects situational awareness as well as the ability to communicate, which can impact on mental health and, in extreme cases, cognitive function. Nevertheless, for complex reasons that range from discomfort to social stigma, only 2M people choose to wear hearing aids.

The ambition of this project is to develop a cobot that will augment the auditory experience of a hearing-impaired person. The cobot will move autonomously within the human partner's household to assist with everyday tasks. Our research will enable the cobot to exploit ego-motion in order to learn an internal representation of the acoustic scene (children chattering, kettle boiling, spouse calling for help). The cobot will interface with its partner through an on-person smart device (watch, mobile phone). Using the human-cobot interface, the cobot will alert its partner to salient events (call for help) via vibrating messages, and share its auditory experiences via interactive maps that visualise auditory cues and indicate saliency (e.g., loudness, spontaneity) and valence (positive vs concerning).
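
As a concrete illustration of the interface logic described above, the hypothetical Python sketch below scores a detected event using the saliency cues named in the abstract (loudness, spontaneity) and triggers a vibrating alert only for salient, concerning events. All names, weights, and thresholds are invented for illustration and do not represent the project's design.

    # Hypothetical sketch of the human-cobot alerting logic.
    from dataclasses import dataclass

    @dataclass
    class AcousticEvent:
        label: str          # e.g. "call for help", "kettle boiling"
        loudness: float     # 0..1, relative level
        spontaneity: float  # 0..1, how unexpected given the learned scene
        valence: str        # "positive" or "concerning"

    def saliency(e: AcousticEvent) -> float:
        # Illustrative score combining the cues named in the abstract.
        return 0.5 * e.loudness + 0.5 * e.spontaneity

    def should_alert(e: AcousticEvent, threshold: float = 0.6) -> bool:
        # Only salient, concerning events reach the on-person device.
        return e.valence == "concerning" and saliency(e) >= threshold

    event = AcousticEvent("thump upstairs", loudness=0.7, spontaneity=0.9,
                          valence="concerning")
    if should_alert(event):
        print(f"Vibrate watch: {event.label} (saliency {saliency(event):.2f})")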

In contrast to smart devices, the cobot will have the unique capability to actively attend to and explore uncertain events (thump upstairs), and take action (assist spouse, call ambulance) without the need for permanently installed devices in personal spaces (bathroom, bedroom). Therefore, the project has the potential to transform the lives of people with hearing impairments by enabling long-term independent living, safeguarding privacy, and fostering inclusivity.

 
Description UoS-NOCS Collaboration on Fibre Optic Sensing 
Organisation National Oceanography Centre
Country United Kingdom 
Sector Academic/University 
PI Contribution Our team contributes expertise in acoustic signal processing and deep learning for audio. Two PhD students, jointly supervised with Dr Belal (NOCS), are developing novel machine learning models for (a) the disentanglement of acoustic cues embedded in mixture signals and (b) the denoising of fibre optic sensing data (see the illustrative sketch after this record).
Collaborator Contribution NOCS provide world-renowned expertise in densely distributed data acquisition.
Impact Identifying and distinguishing among events in the marine environment is an essential task in developing a better understanding of climate change and of animal and human behaviour across 71% of the planet. Sources of ambient noise in the marine environment can be classified into natural (sediment flows, volcanic geo-hazards, etc.) and anthropogenic (ocean bottom trawling, offshore drilling, etc.). The aim of this research is to radically improve ocean observation and visualisation capabilities, both for oceanographic research and for various marine-sector applications of national and strategic importance. This research collaboration between NOC and the University of Southampton combines expertise in densely distributed big-data acquisition with machine learning and AI techniques to characterise and automatically identify patterns in these data, aiding human understanding of the environment. The key challenges in this project stem from the volume of streaming data generated and the lack of substantial quantities of labelled signals. This is a highly multidisciplinary collaboration that cuts across marine science, physics, and machine learning & AI.
Start Year 2021
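
For illustration only, the following Python (PyTorch) sketch shows one common way to disentangle acoustic cues embedded in mixture signals, as mentioned in the PI contribution above: a recurrent network estimates one time-frequency mask per source and applies it to the mixture spectrogram. The architecture, tensor shapes, and source count are assumptions, not the students' actual models.

    # Illustrative masking-based separation sketch (assumed architecture).
    import torch
    import torch.nn as nn

    class MaskNet(nn.Module):
        def __init__(self, n_freq=257, n_sources=2, hidden=256):
            super().__init__()
            self.rnn = nn.GRU(n_freq, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_freq * n_sources)
            self.n_freq, self.n_sources = n_freq, n_sources

        def forward(self, mix_mag):  # (batch, time, freq) magnitude spectra
            h, _ = self.rnn(mix_mag)
            masks = torch.sigmoid(self.head(h))  # (batch, time, freq*sources)
            masks = masks.view(*mix_mag.shape[:2], self.n_sources, self.n_freq)
            # Apply each mask to the mixture: per-source magnitude estimates.
            return masks * mix_mag.unsqueeze(2)

    net = MaskNet()
    mix = torch.rand(4, 100, 257)  # dummy batch of mixture spectrograms
    est = net(mix)                 # (4, 100, 2, 257): two separated sources
    print(est.shape)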
 
Description UoS-Stanford CCRMA Collaboration on Auditory Modelling 
Organisation Stanford University
Country United States 
Sector Academic/University 
PI Contribution Our team works closely with the Stanford University Center for Computer Research in Music and Acoustics on bio-inspired foundation models, based on deep learning, for machine listening. We develop the mathematical models and implement the software for training, testing, and evaluating models for, e.g., sound event classification and the detection of salient events (see the illustrative sketch after this record).
Collaborator Contribution Stanford University (Center for Computer Research in Music and Acoustics) contribute expertise, advice and guidance on auditory modelling.
Impact The collaboration focuses on bio-inspired, deep-learning-based models for audio, bringing together expertise in auditory modelling and deep learning. The collaboration is ongoing and, to date, has led to the following outputs: - Dr Evers and Professor Slaney supervised a Year 4 Undergraduate Group Design Project, conducted between October 2024 and January 2025. The project developed a live demonstrator for immersive audio technologies that alerts end-users of augmented/virtual reality devices to salient acoustic events in their physical surroundings.
Start Year 2024
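
As an illustration of the kind of machine-listening task named in this record, the sketch below classifies a sound event from a clip-level audio embedding, in the spirit of attaching a lightweight head to a pretrained ('foundation') model. The embedding dimension, class count, and head architecture are assumptions for illustration only.

    # Illustrative sound event classification head (assumed sizes).
    import torch
    import torch.nn as nn

    EMBED_DIM, N_CLASSES = 512, 10  # assumed embedding size / event classes

    classifier = nn.Sequential(
        nn.Linear(EMBED_DIM, 128),
        nn.ReLU(),
        nn.Linear(128, N_CLASSES),
    )

    def classify(embedding: torch.Tensor) -> int:
        """Return the most likely sound-event class for one clip embedding."""
        with torch.no_grad():
            logits = classifier(embedding)
        return int(logits.argmax(dim=-1))

    clip_embedding = torch.randn(EMBED_DIM)  # stand-in for a real model output
    print("predicted class:", classify(clip_embedding))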
 
Description ECS Engage Outreach 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact Approximately 40 students attended a school visit to the University of Southampton. Dr Evers's team presented a live robot demonstrator that showcased the benefits and challenges of audio data and of processing such data with machine learning models.
Year(s) Of Engagement Activity 2024
 
Description Invited Seminar, Stanford University, Center for Computer Research in Music and Acoustics (CCRMA) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Approximately 15 researchers from across academia and companies in the Palo Alto area attended an invited seminar on "Embodied Audio". The seminar sparked discussions about the various challenges in machine listening and led to an ongoing collaboration between Dr Evers and Professor Malcolm Slaney (Stanford University, CCRMA).
Year(s) Of Engagement Activity 2024
URL https://ccrma.stanford.edu/events/christine-evers-embodied-audio