Towards a multisensory hearing aid: Engineering synthetic audiovisual and audiotactile signals to aid hearing in noisy backgrounds
Lead Research Organisation:
Imperial College London
Department Name: Bioengineering
Abstract
There are more than 10 million people in the U.K., one in six, with some form of hearing impairment. Hearing aids are the only assistive technology currently available to them. However, hearing aids can only help people with certain types of hearing impairment, and their users still have major problems understanding speech in noisy backgrounds. Much effort has therefore been devoted to signal processing that reduces the background noise in complex sounds, but this has not yet yielded significant improvements in speech intelligibility.
The research vision of this project is to develop a radically different technology for assisting people with hearing impairments to understand speech in noisy environments, namely simplified visual and tactile signals that are engineered from a speech signal and presented congruently with the sound. Visual information such as lip reading can indeed improve speech intelligibility significantly. Haptic information, such as a listener touching the speaker's face, can enhance speech perception as well. However, touching a speaker's face is rarely an option in real life, and lip reading is often unavailable, for instance when a speaker is too far away or out of the field of view. Moreover, natural visual and tactile stimuli are highly complex and difficult to substitute when they are not available naturally.
In this project I will engineer simple visual and tactile signals from speech, designed to enhance the neural response to the rhythm of speech and thereby its comprehension. This builds on recent breakthroughs in our understanding of the neural mechanisms of speech processing. These breakthroughs have uncovered a mechanism by which neural activity in the auditory areas of the brain tracks the speech rhythm, set by the rates of syllables and words, and thus parses speech into these functional constituents. Strikingly, this speech-related neural activity can be enhanced by visual and tactile signals, improving speech comprehension. These remarkable visual-auditory and somatosensory-auditory interactions thus open an efficient and non-invasive way of increasing the intelligibility of speech in noise by providing congruent visual and tactile information.
The required visual and tactile stimuli need to be engineered to efficiently drive the cortical response to the speech rhythm. Since the speech rhythm is evident in the speech envelope, a single temporal signal, delivered through one channel or a few channels (low density), will suffice for the required visual and tactile signals. These signals can therefore later be integrated into non-invasive wearable devices such as hearing aids. Because this multisensory speech enhancement will employ existing neural pathways, the developed technology will not require training and will therefore benefit young and elderly people alike.
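To make the envelope-based idea concrete, here is a minimal sketch in Python of how such a single temporal signal could be extracted from a recording. The file name, filter order, and the 2-8 Hz (theta, syllable-rate) band are illustrative assumptions for this sketch, not the project's actual specification.

```python
# Minimal sketch: derive a single syllable-rate signal from speech.
# Assumptions (illustrative only): a mono WAV file "speech.wav" and
# a 4th-order Butterworth band-pass over the 2-8 Hz theta band.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, hilbert, sosfiltfilt

fs, speech = wavfile.read("speech.wav")   # hypothetical mono recording
speech = speech.astype(np.float64)
speech /= np.max(np.abs(speech))          # normalise amplitude

# Broadband envelope: magnitude of the analytic (Hilbert) signal
envelope = np.abs(hilbert(speech))

# Keep only the slow, syllable-rate fluctuations that set the speech rhythm
sos = butter(4, [2.0, 8.0], btype="bandpass", fs=fs, output="sos")
rhythm = sosfiltfilt(sos, envelope)
```

The resulting one-dimensional `rhythm` signal is exactly the kind of low-density temporal signal that could drive a visual or tactile transducer alongside the sound.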
My specific aims are (1) to engineer synthetic visual stimuli from speech to enhance speech comprehension, (2) to engineer synthetic tactile stimuli from speech to enhance speech comprehension, (3) to develop a computational model of speech enhancement through multisensory integration, (4) to integrate the engineered synthetic visual and tactile stimuli with speech presentation, and (5) to evaluate the efficacy of the developed multisensory stimuli for aiding patients with hearing impairment. I will achieve these aims by working with six key industrial, clinical and academic partners.
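As a hypothetical illustration of aim (2), the syllable-rate signal sketched above could be turned into a tactile stimulus by amplitude-modulating a vibration carrier. The 230 Hz carrier frequency, the function name and the normalisation below are illustrative assumptions rather than the project's design.

```python
# Hypothetical sketch of aim (2): convert the syllable-rate "rhythm"
# signal into a vibrotactile drive by amplitude-modulating a carrier.
# The 230 Hz carrier is an illustrative choice, not a project parameter.
import numpy as np

def vibrotactile_signal(rhythm, fs, carrier_hz=230.0):
    """Amplitude-modulate a sinusoidal carrier with the speech rhythm."""
    t = np.arange(len(rhythm)) / fs
    mod = np.clip(rhythm, 0.0, None)   # half-wave rectify the rhythm signal
    peak = mod.max()
    if peak > 0:
        mod = mod / peak               # keep the modulator in [0, 1]
    return mod * np.sin(2.0 * np.pi * carrier_hz * t)
```

A matching visual signal for aim (1) could, in the same spirit, modulate the brightness or size of a simple on-screen shape with the same rhythm signal.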
By inventing and demonstrating a radically new approach to hearing-aid technology, this research will lead to novel, efficient ways of improving speech-in-noise understanding, the key difficulty for people with hearing impairment. The project aligns closely with the recently founded Centre for Neurotechnology at Imperial College, as well as, more generally, with the current major U.S. and E.U. initiatives on brain research.
Planned Impact
About 16% of the adult population in the U.K. suffers from hearing impairment. Understanding speech in noise is the biggest problem that most of them face in everyday listening situations, even when wearing hearing aids. This project has the potential to revolutionize hearing aids by integrating the audio signal with synthetic visual and tactile signals that enhance speech-in-noise perception, and it will therefore have a large impact on health and quality of life in the U.K. By providing a Proof of Concept for multisensory hearing aids, the project will also benefit the U.K.'s high-technology and medical-device industry. Specifically, the research will benefit the following groups:
(i) People with mild to moderate sensorineural hearing loss. Mild to moderate sensorineural hearing loss is widespread, particularly amongst elderly people. Although it can be alleviated with hearing aids, affected individuals nevertheless retain significant difficulties in understanding speech in challenging listening environments such as background noise. The work proposed here will provide a Proof of Concept for multisensory hearing aids that can significantly enhance the comprehension of speech in noise. This Proof of Concept and the further development of the technology into wearable devices will significantly boost the quality of life for people with sensorineural hearing loss.
(ii) People with auditory processing disorder. Auditory processing disorder leads to major problems with understanding speech in noise, and can severely impact a person's social and economic development. There is currently no treatment or rehabilitation, and current assessments are based solely on behavioral tests. My research will yield a novel way to enhance the comprehension of speech in noise for patients with auditory processing disorder. This will greatly aid people with this disorder to succeed in real-world environments.
(iii) High-tech and medical-device industry. The research will engineer synthetic visual and tactile stimuli from speech for enhancing the comprehension of speech in noise. This will establish a Proof of Concept for multisensory hearing aids that will spark further development by high-tech and medical-device companies that work on wearable devices. Two important industrial players, Google Research and Oticon, a world-leading Danish hearing-aid manufacturer, have therefore already joined as project partners. I will work with them to ensure the further development and commercialization of the developed technology. IP obtained from the project will be made available to these companies and others through licensing.
(iv) Healthcare professionals. The developed technology will have significant impact on audiologists and ENT doctors. As set out above, treatments of sensorineural hearing loss and auditory processing disorder are currently limited to hearing aids, which fail to enhance speech-in-noise comprehension. This project will achieve such enhancement through multisensory stimulation, which will transform and improve the services provided by audiologists and ENT doctors.
Organisations
- Imperial College London (Fellow, Lead Research Organisation)
- Google (United States) (Project Partner)
- Imperial College Healthcare NHS Trust (Project Partner)
- University College London (Project Partner)
- Oticon (Denmark) (Project Partner)
- Ruhr University Bochum (Project Partner)
- Sorbonne University (Project Partner)
People
- Tobias Reichenbach (Principal Investigator / Fellow)
Publications
- Saiz-Alía M (2021) Otoacoustic Emissions Evoked by the Time-Varying Harmonic Structure of Speech. in eNeuro
- Vanheusden FJ (2020) Hearing Aids Do Not Alter Cortical Entrainment to Speech at Audible Levels in Mild-to-Moderately Hearing-Impaired Subjects. in Frontiers in Human Neuroscience
- Keshavarzi M (2020) Transcranial Alternating Current Stimulation With the Theta-Band Portion of the Temporally-Aligned Speech Envelope Improves Speech-in-Noise Comprehension. in Frontiers in Human Neuroscience
- Kegler M (2022) The neural response at the fundamental frequency of speech is modulated by word-level acoustic and linguistic information. in Frontiers in Neuroscience
- Varano E (2022) Speech-Driven Facial Animations Improve Speech-in-Noise Comprehension of Humans. in Frontiers in Neuroscience
- Kadir S (2020) Modulation of Speech-in-Noise Comprehension Through Transcranial Current Stimulation With the Phase-Shifted Speech Envelope. in IEEE Transactions on Neural Systems and Rehabilitation Engineering
- Weissbart H (2020) Cortical Tracking of Surprisal during Continuous Speech Comprehension. in Journal of Cognitive Neuroscience
- Etard O (2022) No Evidence of Attentional Modulation of the Neural Response to the Temporal Fine Structure of Continuous Musical Pieces. in Journal of Cognitive Neuroscience
- Guilleminot P (2023) Audiotactile Stimulation Can Improve Syllable Discrimination through Multisensory Integration in the Theta Frequency Band. in Journal of Cognitive Neuroscience
- Thornton M (2022) Robust decoding of the speech envelope from EEG recordings through deep neural networks. in Journal of Neural Engineering
- Kulkarni A (2021) Effect of visual input on syllable parsing in a computational model of a neural microcircuit for speech processing. in Journal of Neural Engineering
- Saiz-Alía M (2020) Computational modeling of the auditory brainstem response to continuous speech. in Journal of Neural Engineering
- Etard O (2019) Decoding of selective attention to continuous speech from the human auditory brainstem response. in NeuroImage
- Kegler M (2021) Modelling the effects of transcranial alternating current stimulation on the neural encoding of speech in noise. in NeuroImage
- Saiz-Alía M (2019) Selective attention in the brainstem and speech-in-noise comprehension. in Proceedings of the International Congress on Acoustics
- Reichenbach T (2019) Decoding the neural processing of selective attention to speech. in Proceedings of the International Congress on Acoustics
- Guilleminot P (2022) Enhancement of speech-in-noise comprehension through vibrotactile stimulation at the syllabic rate. in Proceedings of the National Academy of Sciences of the United States of America
- Saiz-Alía M (2019) Individual differences in the attentional modulation of the human auditory brainstem response to speech inform on speech-in-noise deficits. in Scientific Reports
- Sumner L (2021) Steady streaming as a method for drug delivery to the inner ear. in Scientific Reports
- Etard O (2019) Neural Speech Tracking in the Theta and in the Delta Frequency Band Differentially Encode Clarity and Comprehension of Speech in Noise. in The Journal of Neuroscience
- Keshavarzi M (2021) Cortical Tracking of a Background Speaker Modulates the Comprehension of a Foreground Speech Signal. in The Journal of Neuroscience
- Varano E (2023) AVbook, a high-frame-rate corpus of narrative audiovisual speech for investigating multimodal speech perception. in The Journal of the Acoustical Society of America
- BinKhamis G (2019) Speech Auditory Brainstem Responses in Adult Hearing Aid Users: Effects of Aiding and Background Noise, and Prediction of Behavioral Measures. in Trends in Hearing
Description
We have found that synthetically generated facial animations can significantly improve the understanding of speech in noise. Moreover, we found that small vibrations timed to the syllable rhythm of speech can also significantly enhance speech comprehension. We have been able to link these multisensory benefits to multisensory integration in the auditory cortex.

Exploitation Route
The findings may inform the design of other intervention methods that aid people with hearing impairment in understanding speech in background noise.

Sectors
Electronics; Healthcare; Pharmaceuticals and Medical Biotechnology