Speaker-controlled variability in connected discourse: acoustic-phonetic characteristics and impact on speech perception

Lead Research Organisation: University College London
Department Name: Phonetics and Linguistics

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
 
Description We recorded the speech of pairs of participants while they completed a 'spot the difference' task ('diapix'), either in good listening conditions or when one of the participants found it hard to hear the other because of a simulated cochlear implant (VOC), the presence of babble noise (BABBLE), or because s/he was not a native English speaker (L2). We analysed the acoustic-phonetic characteristics of the speech produced by 40 speakers (while hearing normally) when interacting with interlocutors experiencing these three types of communication barrier (VOC, BABBLE, L2). We found that the acoustic-phonetic enhancements made were tailored to counteract the specific adverse condition. Relative to their conversational speech, talkers spoke with a higher and more varied pitch range and greater mid-frequency energy when clarifying their speech for an interlocutor hearing them in noise, but did not change these characteristics when clarifying their speech for an interlocutor hearing them via a noise-excited vocoder, where most pitch information is lost and audibility thresholds are not an issue. Talkers made these specific adjustments even though they were not directly experiencing the adverse listening condition. As proposed by Lindblom's Hyper-Hypo theory, speech production appeared to be guided by the needs of the listener, at the cost of greater effort for the speaker. Perception experiments showed a significant link between the improvement in clarity ratings between the casual and VOC conditions and the degree of acoustic-phonetic change. Talkers' clarity measures were significantly correlated across different diapix conditions: although talkers became clearer in the VOC and BABBLE conditions, their clarity 'ranking' remained consistent relative to their inherent clarity in the 'no barrier' condition.



We also compared the type of clear speech naturally elicited by communicative needs with the clear speech obtained when participants were instructed to read sentences clearly. Clear read speech was found to have more extreme acoustic-phonetic enhancements than clear spontaneous speech, as it was more consistently hyper-articulated.



We investigated the relation between the internal structure of phonetic categories and consonant intelligibility for two phonetic contrasts. Measures of cross-category distance (CCD) (i.e., the difference between two sound categories) and within-category dispersion (WCD) (i.e., the variability within each sound category) were obtained using 32 iterations per category for each of 40 speakers. These measures varied substantially across talkers but were not correlated across contrasts, suggesting a lack of within-talker consistency in CCD and WCD. Consonant identification data for eight talkers presenting extremes of CCD or WCD revealed some talker effects on reaction time, but these were not correlated with either of the two measures. We found no significant correlations between CCD or WCD measures and broader measures of clarity or communication effectiveness in the diapix tasks. We conclude that the claims of Newman et al. (2001) were premature and that a talker's consistency of production or contrast salience does not appear to be directly correlated with their inherent intelligibility.
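
This record does not state how CCD and WCD were computed. As a minimal illustrative sketch only, assuming a single acoustic measurement per token (for example voice onset time in ms, a hypothetical choice), CCD could be taken as the absolute difference between the two category means and WCD as the mean of the two within-category standard deviations. The function name and the example values below are assumptions, not project data.

```python
from statistics import mean, stdev


def category_measures(cat_a, cat_b):
    """Return (ccd, wcd) for two lists holding one acoustic value per token.

    Assumed definitions (not taken from the project record):
      ccd = absolute difference between the two category means
      wcd = mean of the two within-category sample standard deviations
    """
    if len(cat_a) < 2 or len(cat_b) < 2:
        raise ValueError("each category needs at least two tokens")
    ccd = abs(mean(cat_a) - mean(cat_b))
    wcd = mean([stdev(cat_a), stdev(cat_b)])
    return ccd, wcd


# Hypothetical values for one talker and one contrast (illustration only).
voiced = [10, 12, 14]
voiceless = [60, 64, 68]
ccd, wcd = category_measures(voiced, voiceless)
print(f"CCD = {ccd}, WCD = {wcd}")
```

Under these assumed definitions, a talker with a large CCD and a small WCD produces well-separated, tightly clustered categories.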
Exploitation Route The diapixUK materials that we developed for this project could be further developed for clinical use by speech and language therapists. They are appropriate for use with children as well as adults, and a simple measure of transaction time provides a general measure of communication efficiency. The recorded speech can be analysed to investigate aspects of communication such as clarification requests, repair strategies, and lexical diversity. The complete corpus of speech recordings and aligned orthographic transcriptions (the LUCID corpus) is online and is made available to other researchers on request.

The diapixUK picture materials are also available for use by other researchers. They have been constructed so that they can easily be adapted for other research purposes. They have now been adapted for use in a number of different languages and are widely used in speech sciences research, as shown by frequent requests for high-resolution versions of these materials and frequent references to them at major speech science and technology conferences such as Interspeech.
Sectors Digital/Communication/Information Technologies (including Software),Healthcare

URL https://valeriehazan.com/wp/index.php/overview-2/
 
Description The diapix method, developed to collect speaker/listener interactions in good and adverse conditions, is being investigated as a clinical tool to look at communication efficiency in children and adults with language or hearing impairments.
First Year Of Impact 2016
Sector Digital/Communication/Information Technologies (including Software),Education,Healthcare
Impact Types Societal

 
Description Clear speech strategies: how auditory and visual cues are weighted
Amount $13,450 (AUD)
Funding ID 20211.71719 [ORS] 
Organisation Western Sydney University 
Sector Academic/University
Country Australia
Start 09/2011 
End 12/2012
 
Description Speaker-controlled variability in children's speech in interaction
Amount £307,179 (GBP)
Funding ID ES/I02896X/1 
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 06/2011 
End 05/2014
 
Title LUCID corpus 
Description The LUCID (London UCL Clear speech in interaction) corpus contains all of the speech recordings made for this project as well as word-aligned orthographic transcriptions. The corpus is available to other researchers for non-commercial purposes as part of the online OSCAAR resource on password request. 
Type Of Material Database/Collection of data 
Year Produced 2011 
Provided To Others? Yes  
Impact This corpus and the related picture materials have now been used by a number of other researchers who have published work based on the use of Diapix at major international conferences such as Interspeech and in journal articles. 
URL http://oscaar.ling.northwestern.edu/collection.php?c=19
 
Description British Library event - Do you hear what I hear? 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Our demonstration of the Diapix task was included in a public event at the British Library which aimed to highlight how our brain perceives and interprets speech and music.

This event gave us valuable experience, which led us to submit a successful proposal for inclusion in the Royal Society Summer Exhibition.
Year(s) Of Engagement Activity 2010
URL http://www.bl.uk/reshelp/experthelp/science/eventsandprojects/eventsummaries.html#brainandsound
 
Description Invited lecture: Clear speech strategies in speaker-listener interactions 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Invited lecture - Research colloquium series, MARCS Laboratories, University of Western Sydney, Sydney, Australia.



This lecture gave an overview of our work on this project.

Discussions that took place during my visit to MARCS led to a successful ESRC grant application.
Year(s) Of Engagement Activity 2011
 
Description Invited lecture: What can speaker-listener interaction tell us about speech perception in adverse listening conditions? 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact This was an invited presentation at a workshop on psycholinguistic approaches to speech perception in adverse listening conditions. This presentation publicised our ESRC project and also our newly developed diapixUK task, which can be used for recording corpora of spontaneous speech.

Other labs have started using our diapixUK materials for their recordings.
Year(s) Of Engagement Activity 2010
 
Description Media - Radio 4 interview about our project 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Interview with Valerie Hazan on Material World, BBC Radio 4, about the Noisy World exhibit at the Royal Society Summer Exhibition 2011, which included a demonstration linked to work developed within this research project. The interview is available online and includes a demonstration of our diapix task.

This publicised our project at a crucial time and aided our participant recruitment.
Year(s) Of Engagement Activity 2011
URL http://www.bbc.co.uk/programmes/b01292vf
 
Description Noisy world exhibit - Royal Society summer exhibition 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Our modern world is full of noise: from machinery, transport, radio & television, MP3 players, and other people talking. This exhibit showed how background noises of all types affect our ability to understand speech and make it harder for us to think about and remember things.

The exhibit involved a live demonstration of the diapix task and test conditions that we used in our project to record interactions between two speakers in good and adverse conditions.

Our work received great exposure as a result of our inclusion in the RS Summer Exhibition. It was featured on Radio 4, and the website and other materials developed for the Summer Exhibition were very useful tools for further dissemination. We also recruited some participants who found out about our work at the exhibition.
Year(s) Of Engagement Activity 2011
URL http://royalsociety.org/summer-science/2011/noisy-world/