Audiovisual integration of identity information from the face and voice: behavioural, fMRI and MEG studies.

Lead Research Organisation: University of Glasgow
Department Name: School of Psychology

Abstract

Often when we hear a voice we form an image of the speaker, which can turn out to be quite wrong when we then see the face. This is because our brain combines information from a person's face and voice to make better estimates of that person's characteristics and to allow more efficient social interactions. How the brain combines identity information from the face and the voice, however, is poorly understood. In this project we combine a range of state-of-the-art techniques (face and voice morphing, functional magnetic resonance imaging, magnetoencephalography) to investigate the brain mechanisms of audiovisual integration of identity information. Participants will be played video clips showing a person saying a simple syllable, in which the face and voice contain similar or different identity information (e.g., Bob's voice with Sam's face), and will be asked to categorise the person's identity (e.g., Bob vs. Sam). While they perform this task, participants' brain activity will be measured with millisecond and millimetre precision. The results will allow us to understand how two sensory modalities - audition and vision - are integrated in the brain during a task important to our everyday social interactions. They will advance our understanding of how the brain recognizes people from their face and voice, with potential outcomes for hearing-impaired persons and impact on the growing technology for automated person recognition.
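As an illustration only, the sketch below enumerates the kind of face-voice pairings such a design implies: each trial pairs a face and a voice drawn from a morph continuum between two identities, and a trial is congruent when the two carry the same identity information. The identity names, morph levels and code structure are hypothetical, not the project's actual stimulus set.

    # Minimal sketch of a face-voice congruence design (hypothetical values).
    from itertools import product

    # Parametric morph levels along a Bob -> Sam continuum
    # (0.0 = 100% Bob, 1.0 = 100% Sam).
    morph_levels = [0.0, 0.25, 0.5, 0.75, 1.0]

    # Fully cross face and voice morph levels; a trial is audiovisually
    # congruent when face and voice carry the same identity information.
    trials = [
        {"face_morph": f, "voice_morph": v, "congruent": f == v}
        for f, v in product(morph_levels, morph_levels)
    ]

    print(f"{len(trials)} trials, "
          f"{sum(t['congruent'] for t in trials)} congruent")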

Technical Summary

The cerebral mechanisms of person recognition through face and voice are poorly understood. Much research has focused on audiovisual speech perception, but little scientific effort has been devoted to the multimodal integration of other types of paralinguistic, socially relevant information present in both voices and faces, such as identity. In this project we combine face and voice morphing, functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) to investigate the brain mechanisms of audiovisual integration of identity information. Participants will be played dynamic, synchronous video clips (a person saying a simple syllable) in which the identity information contained in the face and voice is independently and parametrically manipulated via 'video morphing'. While they perform different tasks on these stimuli (an implicit 1-back task, and an explicit identity categorisation task), participants' brain activity will be measured with millisecond and millimetre precision, first with MEG, then with fMRI. Analyses will characterize the cortical architecture involved in integrating face and voice identity information. In line with the latest developments in the field of multimodal integration, we will seek converging evidence from several complementary analysis strategies: comparison of bimodal vs. unimodal stimuli; manipulation of stimulus informativeness (via morphing); manipulation of audiovisual congruence; comparison of implicit vs. explicit tasks; and correlation of brain activity with behavioural audiovisual gain (see the sketch below). These complementary analyses are expected to yield important new knowledge of the cerebral bases of audiovisual integration in the non-linguistic domain, with potential outcomes for hearing-impaired persons and impact on the growing technology for automated person recognition.
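To make the last analysis strategy concrete, here is a minimal sketch of how behavioural audiovisual gain could be computed and correlated with a per-participant measure of brain activity. The toy data, the definition of gain (bimodal accuracy minus the best unimodal accuracy) and the variable names are all assumptions for illustration; the project's actual measures may differ.

    # Minimal sketch: behavioural audiovisual gain and its correlation with
    # a hypothetical per-participant brain measure. Toy data throughout.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 20  # hypothetical number of participants

    # Identity-categorisation accuracy (proportion correct) per participant.
    acc_voice = rng.uniform(0.60, 0.80, n)   # unimodal auditory
    acc_face = rng.uniform(0.65, 0.85, n)    # unimodal visual
    acc_av = rng.uniform(0.75, 0.95, n)      # congruent audiovisual

    # Audiovisual gain: improvement of the bimodal condition over the
    # better of the two unimodal conditions, per participant.
    av_gain = acc_av - np.maximum(acc_voice, acc_face)

    # Hypothetical activation estimates (e.g., ROI beta weights) used to
    # test for a brain-behaviour correlation with audiovisual gain.
    roi_betas = 0.5 * av_gain + rng.normal(0.0, 0.02, n)

    r = np.corrcoef(av_gain, roi_betas)[0, 1]
    print(f"mean AV gain = {av_gain.mean():.3f}, r = {r:.2f}")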

Planned Impact

The research is likely to have significant impact on a wide range of academic communities in cognitive neuroscience and beyond, as it touches several domains: auditory and visual perception and cognition, multimodal integration, social neuroscience, and comparative cognition (cf. Academic Beneficiaries). Outside of academia, the research will have potential benefit to several user groups in the longer term.

Persons with auditory perception deficits, such as cochlear implant patients and hearing aid users, increase their reliance on the visual modality in a way that can be maladaptive. The research we conduct in normal participants has the potential to be translated into clinical practice by informing algorithms for auditory decoding in cochlear implants, or by optimizing training and rehabilitation strategies. We have experience enhancing the impact of our work through links with industry: we have had funded collaborations with France Telecom and with Cochlear, one of the leading cochlear implant manufacturers, and this research could lead to further similar collaborations in the longer term.

Another potential pathway to impact is to develop links with the growing 'social computing' industry. Our results have the potential to interest designers of software for automated person recognition from multiple sources (audio, video). Automated recognition from the face alone, or the voice alone, has been achieved to varying degrees, but by very different communities using quite different technologies, even though the computational problems posed by recognition in vision and audition are very similar in nature. Information on the solutions to this problem found by our brain over millions of years of evolution could give important clues to the design of recognition systems that are more parsimonious and, especially, more robust to degradation. Links with industry in this domain will be specifically sought with assistance from Research and Enterprise. If exploitable intellectual property is created during the course of this work, colleagues in the University's Research and Enterprise Department will assess it and, where applicable, protect it and develop an exploitation plan, so that any potentially valuable results are exploited to provide a suitable return to the University and the researchers, and in a manner that provides maximum benefit for the UK economy.

Another important user group potentially affected by the research is the wider public. There is enormous public and media interest in faces, voices and social interactions, and we have generally had an excellent response to our activities to engage the public in our work. The laboratory has engaged with the public on several occasions, such as at the Glasgow Science Centre during Brain Awareness Week. In addition to publishing in academic journals and presenting at scientific conferences, we engage with the press to improve the impact of our findings. The PI has received specific media training through the BBSRC-organized Media Training Course, which has been instrumental in enhancing the profile of the laboratory's research (several press and radio appearances in the past year, including the Sunday Telegraph and BBC4).
The research will also have an important impact on the careers of the research assistant who will perform the work and of Ms Rebecca Watson, a BBSRC-funded PhD student. Ms Watson is conducting her thesis work on the topic of audiovisual integration, and she provided some of the pilot data included in the case for support. Both researchers will benefit from the research, which will further extend their skills in two domains - vision and audition - using two methodologies - fMRI and MEG. Both will be involved in user engagement activities during the course of the research.
