Speech Perception under Cognitive Load

Lead Research Organisation: University of York
Department Name: Psychology

Abstract

Most theories of human speech perception are derived from tasks performed in a quiet environment and under conditions of undivided attention. However, in the past few years, there has been a surge of interest in modelling speech recognition in more realistic conditions (e.g., noisy background, accented speech). Yet among these realistic conditions, those resulting from a cognitive load have received little attention. Here, we define cognitive load (CL) as any listening challenge arising not from a distortion of the speech signal but from the recruitment of processing resources due to concurrent attentional or mnemonic demands. For example, what are the consequences of monitoring cockpit instruments on a pilot's ability to follow spoken instructions from ground control? The disruptive effect of CL on speech perception can be observed as early as the initial stages of acoustic encoding. Under some circumstances, CL can even lead to a form of transient hearing impairment called inattentional deafness. Despite the obvious implications that these results have for theory and clinical practice, little is known about the low-level mechanisms by which CL interferes with speech perception. The aim of this proposal is to address this issue in three interconnected research streams drawing upon psychometric and identification paradigms.

The first stream asks whether CL affects all acoustic dimensions of speech equally. This question is important because not all acoustic dimensions are equally crucial for communication. For example, successful word recognition is more resilient to pitch distortions than to duration distortions. The idea that CL affects some dimensions more than others is motivated by the claim that CL (e.g., a concurrent visual task) causes listeners to rapidly shift attention back and forth between the speech signal and the CL task, leading to an underestimation of the duration of the speech signal. If this hypothesis is correct, CL should lead primarily to a distortion of auditory temporal judgements and leave other core dimensions (loudness, pitch, and spectral structure) unaffected. This will be contrasted with the claim that CL leads to a general reduction in auditory precision across all acoustic dimensions.

The second stream investigates whether the format of the CL stimuli affects the severity of the CL interference. For example, is speech perception more affected by a concurrent task that requires rehearsing words silently (phonological format) or by a task that requires processing visual stimuli (visual format)? These experiments will address the debate between modal and amodal views of the processing resources used during speech perception.

The third stream aims to distinguish two potential mechanisms behind CL interference: encoding and maintenance. Encoding is the process of converting a sensory input into mental representations. Maintenance is the process of preserving these representations in memory. Encoding of the CL stimuli will be manipulated such that it takes place either during or before the speech stimuli, hence pitting encoding against maintenance as the mechanism underlying interference. An encoding hypothesis predicts that only simultaneous encoding of speech and CL stimuli should lead to CL effects.

In order to explore the generalisability of the above phenomena beyond the speech domain, the effect of CL will be tested on both speech and non-speech sounds. This comparison will situate our findings within the long-standing debate on the existence of a specialised speech mode for sound perception.

Finally, because the notion of "cognitive listening" is becoming central not only in speech research but also in hearing practice, we will engage with clinical audiologists to discuss ways of incorporating a cognitive component into standard pure-tone audiometric (PTA) tests and to advise on potential phase-II clinical trials.

Planned Impact

By studying adverse conditions that place demands on cognitive resources without affecting the integrity of the speech signal itself, this proposal will open new directions in such emerging disciplines as Auditory Cognitive Science, Cognitive Hearing Science, and Cognitive Audiology. The primary outcome of this proposal will be theoretical. The proposed work will specify with high accuracy the mechanisms responsible for the interference caused by cognitive load on low-level speech perception. It will situate the findings within the influential debates on the specialisation of the auditory system for speech vs non-speech sounds, the modal vs amodal views of attentional control, and the mechanisms underlying the allocation of cognitive resources.

The impact of this proposal goes far beyond theory, however. The results will have the potential to address important applied questions about the effect of adverse conditions on the quality of everyday speech communication. In particular, human factors, public safety, and clinical audiology can benefit from strong evidence-based research on the effects of dual-tasking on speech recognition. For example, an understanding of the effect of monitoring flight-deck instruments (i.e., a cognitive load) on the effectiveness of speech communication between pilots and air-traffic controllers has obvious implications for aviation safety. Of particular interest in that context is the extent to which individual differences on standard psychometric tasks can predict poor speech perception under load.

Furthermore, as described in the Pathway to Impact section, we will take concrete steps to bridge basic research and clinical audiology. Our results will be used to investigate ways of increasing the ability of standard pure-tone audiometric (PTA) tests to predict the extent of difficulties experienced by listeners in natural listening conditions, such as speech in noise. Specifically, we will consider whether adding a cognitive component to PTA tests, e.g., performing PTA under divided attention, provides a better indicator of listening difficulties than PTA alone. This will be done through collaborative work with clinical audiologists and discussions about undertaking phase-II clinical trials after the period of funding.

Publications

Description We published an article in the Journal of the Acoustical Society of America showing that dual-tasking impairs speech perception by raising cognitive load. Previous research has shown that cognitive load increases reliance on lexical knowledge and decreases reliance on phonetic detail. Less is known about the effect of cognitive load on the perception of acoustic dimensions below the phonetic level. This study tested the effect of cognitive load on the ability to discriminate differences in duration, intensity, and fundamental frequency of a synthesized vowel. A psychophysical adaptive procedure was used to obtain just noticeable differences (JNDs) on each dimension under load and no load. Load was imposed by N-back tasks at two levels of difficulty (1-back, 2-back) and under two types of load (images, nonwords). Compared to a control condition with no cognitive load, all N-back conditions increased JNDs across the three dimensions. JNDs were also higher under 2-back than 1-back load. Nonword load was marginally more detrimental than image load for intensity and fundamental frequency discrimination. Overall, the decreased auditory acuity demonstrates that the effect of cognitive load on the listening experience can be traced to distortions in the perception of core auditory dimensions.
Exploitation Route Our results provide concrete evidence that CL reduces the ability to estimate the duration and intensity of sound stimuli, similar to the effect of hearing impairment. These results are important for theoretical questions in cognitive listening sciences, but they also touch on important issues in audiological practice. In particular, pure-tone audiometry (PTA) has had limited success in predicting the difficulties that listeners experience in natural listening conditions such as speech in noise. In comparison, psychometric tests like those included in this proposal have proved to be better predictors. Therefore, a PTA test that has a cognitive component built into it (e.g., performing PTA under CL) may be a better indicator of speech-in-noise difficulty than PTA alone. Further research exploring this possibility could be taken up by audiologists in an attempt to improve the diagnosis of hearing impairment.
Sectors Education, Healthcare
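
The description above mentions a psychophysical adaptive procedure for obtaining JNDs but does not specify which one. As an illustration only, the sketch below assumes a generic 2-down-1-up staircase with a fixed step size and a deterministic simulated listener. The function names, step size, starting value, and the two threshold values are all hypothetical and are not taken from the published study.

```python
from typing import Callable, List, Tuple


def staircase_jnd(
    respond: Callable[[float], bool],
    start_delta: float = 20.0,
    step: float = 2.0,
    n_reversals: int = 6,
    max_trials: int = 500,
) -> Tuple[float, List[float]]:
    """Estimate a JND with a 2-down-1-up staircase (assumed procedure).

    respond(delta) returns True when the listener correctly detects a
    difference of size delta. Two correct responses in a row make the
    task harder (smaller delta); one error makes it easier.
    """
    delta = start_delta
    correct_in_a_row = 0
    last_direction = 0  # -1 = last move was down, +1 = up, 0 = no move yet
    reversals: List[float] = []

    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(delta):
            correct_in_a_row += 1
            if correct_in_a_row < 2:
                continue  # wait for a second correct response
            correct_in_a_row = 0
            direction = -1
        else:
            correct_in_a_row = 0
            direction = 1
        # A reversal is a change in the direction of the track.
        if last_direction != 0 and direction != last_direction:
            reversals.append(delta)
        last_direction = direction
        delta = max(delta + direction * step, step)  # keep delta positive

    if not reversals:
        raise ValueError("no reversals recorded; increase max_trials")
    return sum(reversals) / len(reversals), reversals


def make_listener(threshold: float) -> Callable[[float], bool]:
    """Hypothetical noise-free listener: correct iff delta >= threshold."""
    return lambda delta: delta >= threshold


# Arbitrary illustrative thresholds, NOT data from the study.
jnd_no_load, _ = staircase_jnd(make_listener(9.0))
jnd_load, _ = staircase_jnd(make_listener(13.0))
print(jnd_no_load, jnd_load)
```

With a real listener the responses are noisy, so the track converges on a point of the psychometric function (about 70.7% correct for a 2-down-1-up rule) rather than on a hard threshold. A higher JND in the load condition would correspond to the reduced auditory acuity reported above.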

 
Description We are continuing to investigate whether our results on the effect of cognitive load on speech perception extend to participants with a hearing loss. The comparison between normal-hearing and hearing-impaired participants is important because it will allow us to determine the effect of long-term input deprivation on the contribution of cognitive processes (e.g., selective attention) to low-level hearing. Testing in collaboration with audiologists in the audiology department at the NHS hospital in York is ongoing.
Sector Healthcare
 
Description Research grants, new investigator (Ronan McGarrigle)
Amount £228,835 (GBP)
Funding ID ES/R003572/1 
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 10/2018 
End 02/2021
 
Description Collaboration with Emma Mills (York Teaching Hospital NHS Foundation Trust) 
Organisation York Teaching Hospital NHS Foundation Trust
Country United Kingdom 
Sector Public 
PI Contribution Emma Mills and my team are currently trying to extend the results published in JASA 2019 and E&A 2018 to a population of young hearing-impaired listeners.
Collaborator Contribution Emma Mills and my team are currently trying to extend the results published in JASA 2019 and E&A 2018 to a population of young hearing-impaired listeners.
Impact Data collection ongoing.
Start Year 2019
 
Description Collaboration with Ronan McGarrigle 
Organisation University of Bradford
Country United Kingdom 
Sector Academic/University 
PI Contribution Introduced the topic of divided attention as a source of adverse conditions during active listening.
Collaborator Contribution Introduced the idea that we should consider measuring effort, alongside performance, using pupillometry.
Impact Various papers with McGarrigle and collaboration on subsequent grant.
Start Year 2018