COG-MHEAR: Towards cognitively-inspired 5G-IoT enabled, multi-modal Hearing Aids

Lead Research Organisation: Edinburgh Napier University
Department Name: School of Computing

Abstract

Currently, only 40% of people who could benefit from Hearing Aids (HAs) have them, and most people who have HA devices don't use them often enough. There is social stigma around using visible HAs ('fear of looking old'), they require a lot of conscious effort to concentrate on different sounds and speakers, and only limited use is made of speech enhancement - making the spoken words (which are often the most important aspect of hearing to people) easier to distinguish. It is not enough just to make everything louder!

To transform hearing care by 2050, we aim to completely re-think the way HAs are designed. Our transformative approach - for the first time - draws on the cognitive principles of normal hearing. Listeners naturally combine information from both their ears and eyes: we use our eyes to help us hear. We will create "multi-modal" aids which not only amplify sounds but contextually use simultaneously collected information from a range of sensors to improve speech intelligibility. For example, a large amount of information about the words said by a person is conveyed in visual information, in the movements of the speaker's lips, hand gestures, and similar. This is ignored by current commercial HAs and could be fed into the speech enhancement process. We can also use wearable sensors (embedded within the HA itself) to estimate listening effort and its impact on the person, and use this to tell whether the speech enhancement process is actually helping or not.

Creating these multi-modal "audio-visual" HAs raises many formidable technical challenges which need to be tackled holistically. Making use of lip movements traditionally requires a video camera filming the speaker, which introduces privacy questions. We can overcome some of these questions by encrypting the data as soon as it is collected, and we will pioneer new approaches for processing and understanding the video data while it stays encrypted. We aim to never access the raw video data, but still to use it as a useful source of information. To complement this, we will also investigate methods for remote lip reading without using a video feed, instead exploring the use of radio signals for remote monitoring.

Adding in these new sensors and the processing that is required to make sense of the data produced will place a significant additional power and miniaturization burden on the HA device. We will need to make our sophisticated visual and sound processing algorithms operate with minimum power and minimum delay, and will achieve this by making dedicated hardware implementations, accelerating the key processing steps. In the long term, we aim for all processing to be done in the HA itself - keeping data local to the person for privacy. In the shorter term, some processing will need to be done in the cloud (as it is too power intensive) and we will create new very low latency (<10ms) interfaces to cloud infrastructure to avoid delays between when a word is "seen" being spoken and when it is heard. We also plan to utilize advances in flexible electronics (e-skin) and antenna design to make the overall unit as small, discreet and usable as possible.

Participatory design and co-production with HA manufacturers, clinicians and end-users will be central to all of the above, guiding all of the decisions made in terms of design, prioritisation and form factor. Our strong User Group, which includes Sonova, Nokia/Bell Labs, Deaf Scotland and Action on Hearing Loss, will serve to maximise the impact of our ambitious research programme. The outcomes of our work will be fully integrated software and hardware prototypes that will be clinically evaluated using listening and intelligibility tests with hearing-impaired volunteers in a range of modern noisy reverberant environments. The success of our ambitious vision will be measured in terms of how the fundamental advancements posited by our demonstrator programme will reshape the HA landscape over the next decade.

Planned Impact

Significant impact beyond the academic environment is envisaged through this multi-disciplinary programme:

*Impact on people with hearing loss*
Over 10 million people in the UK (~350 million worldwide) currently suffer from debilitating hearing loss, at a cost of ~£450M/year to the NHS, and the number affected in the UK is expected to rise to 14.5 million by 2031. People with serious hearing loss often find themselves socially isolated, with a range of adverse health consequences. Even a modest improvement in hearing, however, can have a significant impact on an individual's social and work life. Our proposed technologies will transform real-time, privacy-preserving and domain-independent learning capabilities, to deliver robust speech intelligibility enhancement and end-user cognitive load management, in the hearing aids (HAs) of 2050. Our technical work programme is focused on this contribution, and the wide range of societal and individual benefits that follow from it. For example, the data we can obtain from our pilot (on/off-chip) HA fitting and clinical validation, in smart assistive care homes and other real-life environments, could potentially enable remote fitting and usage training of HAs for end-users and audiologists, resulting in resource savings and relevance in developing countries. In care homes, where hearing loss affects ~90% of residents, a well-functioning communication channel (even by remote communication) in which the emotional state can be securely sensed and transported would be an ambitious, clinically relevant use case. This would also benefit the visually impaired as it complements the visual processing in speech perception.

*Hearing aid industry*
Our proposed audio-visual (AV) HAs can have a considerable impact on the HA industry, as demand for future AV aids should rapidly complement inferior audio-only devices. The UK's global reputation in hearing research could thus be transformed, stimulating major global HA manufacturing. There are clear precedents for hearing science rapidly transforming hearing technology, e.g. multiple microphone processing and frequency compression have been commercialised to great effect. COG-MHEAR foresees AV processing as the next timely step forward, as previous barriers to AV processing are being overcome: wireless 5G and Internet of Things (IoT) technologies can free computation from having to be performed on the device itself, and wearable computing devices are becoming powerful enough to perform real-time face tracking and feature extraction. AV HAs will also impact on industry standards for HA evaluation and clinical standards for hearing loss assessment. Plans for realising industrial impacts are detailed in the Pathways to Impact and Workplan.

*Applications beyond hearing aids*
We foresee impact in several areas (see Impact Pathways), including cochlear implant signal processing, automatic speech recognition systems, multisensory integration, general auditory systems engineering, and clinical, computational, cognitive and auditory neuroscience. Beyond HAs, novel multimodal ecological momentary assessment tools could be developed, transforming existing sparse, unimodal commercial systems of our User Group members, e.g. Sonova. These could be exploited to personalise the design and usability of other medical instruments to enhance personal product experience. Our proposed wireless-based emotion detection system could extend to emotion-sensitive robotic assistants/companions that could be of interest to smart care homes. Beyond health, our research will deliver a step change in the critical mass of UK engineering and physical science skills to tackle emerging challenges in signal processing. The potential of our disruptive technology can be exploited in teleconferencing and in extremely noisy environments, e.g. dynamic environments and situations where ear defenders are worn, such as emergency and disaster response and battlefield environments.
