Phonetic design of overlapping speech in talk-in-interaction: A cross-linguistic study

Lead Research Organisation: University of Sheffield
Department Name: Human Communication Sciences

Abstract

Participants in a conversation often speak at the same time, leading to overlapping speech; indeed, it has been estimated that up to 13% of the speech we utter during normal conversation occurs simultaneously with that of another talker. Given that conversations are generally perceived to proceed smoothly, the high occurrence of overlap in conversation requires explanation. However, phonetic science has largely neglected overlapping talk, and its potential theoretical importance for our understanding of linguistic structure and function has hardly been explored. An understanding of overlapping talk is also important for a number of practical applications, including those in which human-computer interaction is achieved through a speech interface.

Conversations normally conform to a turn-taking model, in which participants wait for others to stop talking before talking themselves. However, overlapping speech can be used by conversational participants to compete for the turn - in other words, overlaps are initiated in order to demonstrate that the overlapper is interested in taking over the turn immediately, not when the current speaker has finished. Conversations also contain a large number of non-competitive overlaps that have different conversational functions. A common example is the so-called backchannel continuer, such as 'uh-huh', which confirms the current talker's right to the turn.

Given that both competitive and non-competitive overlaps are well attested in conversation, the question arises as to what linguistic resources participants employ in order to display an overlap as turn-competitive or as non-competitive. In answering this question, previous research has focused almost exclusively on the English language. Very little is therefore known about the similarities and differences between languages with respect to overlapping talk, particularly its phonetic aspects. Also, previous work in the field has mainly used careful impressionistic listening rather than quantitative techniques.

Our proposal aims to address the limitations of previous work by conducting a cross-linguistic study of overlapping talk, combining quantitative with qualitative methods. We will record an audio/visual corpus of conversations held in two European languages - Standard Southern British English and Bosnian Serbo-Croatian. The conversational participants will be recorded in a naturalistic setting (such as a meeting room) in their home country. The corpus will allow us to determine whether the phonetic characteristics of overlapping talk are similar across different languages, and whether any differences relate to the particular system of accents and intonation used in the language. We will also use the corpus to determine whether overlapping talk that shares the same phonetic design in the two languages has a similar function.

The proposed project will also establish objective techniques for the analysis of overlapping talk in naturally occurring interactions. This will be achieved by exploiting recent developments in the field of speech technology relating to multi-channel recording and analysis of meetings, segregation of overlapping speech and pitch analysis of concurrent sounds. We anticipate that such techniques will allow us to extract phonetically relevant features from audio recordings of overlapping talkers in a naturalistic environment. A key issue, then, is whether analyses of overlapping talk based on objective acoustic analysis agree with previous analyses based on impressionistic listening.

Finally, we will make an annotated database of our audio/visual recordings available to other researchers via the Internet, and provide the facility for researchers to share their own annotations of the corpus. In this way, the corpus of recordings will be an evolving resource that will continue to benefit the research community beyond the life of the proposed two-year project.

 
Description Each of the original objectives is reproduced here, followed by a summary of work that has addressed it.

a) To record a corpus of overlapping talk, which will be made available to the research community in the form of annotated digital audio and video recordings. The corpus will be recorded both in Standard Southern British English and Bosnian Serbo-Croatian to allow cross-linguistic comparison. For each language, meetings between four participants will be recorded in a naturalistic environment, and in their native country.


We have created a corpus of audio and video recordings of spontaneous, face-to-face multi-party conversation in the two languages. Freely available, high-quality recordings of mundane, non-institutional, multi-party talk are still scarce, and this corpus contributes data suitable for the study of multiple aspects of spoken interaction. In particular, it constitutes a unique resource for spoken Bosnian Serbo-Croatian (BSC), an under-resourced language with no spoken resources available at present. The BSC part of the corpus was created in collaboration with colleagues at the University of Tuzla, Bosnia, under our direction.
The corpus consists of around 4 hours of free conversation in each of the target languages, BSC and British English (BE). The audio was recorded on separate channels using headset microphones, and also with a microphone array containing 8 omnidirectional microphones. Both corpora have been segmented and transcribed using segmentation notions and transcription conventions developed from those of the conversation analysis research tradition. Furthermore, the transcriptions have been automatically aligned with the audio at the word and phone level by forced alignment. The transcriptions have been annotated in ELAN on a number of parameters, including type of overlap (e.g. competitive vs. non-competitive) and non-verbal features (gesture etc.).
A further 'ethical' annotation was carried out to classify passages of the recordings according to how freely they can be released, using a traffic-light system. 'Green' passages can readily be made available to the wider public. 'Amber' passages may contain references that could identify people other than the participants themselves; these could be made available following further clearance. 'Red' passages contain references to other people that might be construed as defamatory, and will not be made available. This scheme has been used when deciding which parts of the corpus to make available to interested researchers.
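As an illustration of why per-speaker channels and time-aligned transcripts matter, the following minimal Python sketch shows one way stretches of overlap could be located once each speaker's talk is available as timed intervals. It is not the project's own tooling: the `Interval` type, the `find_overlaps` function and the example timings are all hypothetical, and real ELAN tiers would first need to be exported to this form.

```python
from itertools import combinations
from typing import NamedTuple


class Interval(NamedTuple):
    """One stretch of talk by one speaker (hypothetical structure)."""
    speaker: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def find_overlaps(intervals):
    """Return (first_speaker, second_speaker, start, end) for every pair of
    intervals by different speakers that share some stretch of time.

    Intervals are sorted by start time, so the first speaker in each result
    began talking no later than the second (the incoming talker).
    """
    overlaps = []
    ordered = sorted(intervals, key=lambda i: i.start)
    for a, b in combinations(ordered, 2):
        if a.speaker == b.speaker:
            continue
        start = max(a.start, b.start)
        end = min(a.end, b.end)
        if start < end:  # the two intervals genuinely intersect
            overlaps.append((a.speaker, b.speaker, start, end))
    return overlaps


# Invented example timings: B starts talking half a second before A stops.
talk = [
    Interval("A", 0.0, 2.5),
    Interval("B", 2.0, 4.0),
    Interval("C", 5.0, 6.0),
]
print(find_overlaps(talk))
```

The pairwise comparison is quadratic in the number of intervals, which is acceptable for a sketch; whether a located overlap is competitive or non-competitive still has to be decided by annotation, as described above.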
b) To develop techniques for extracting phonetically relevant features from recordings of overlapping talkers in a naturalistic environment, by exploiting recent developments in the field of speech technology relating to multi-channel recording and analysis of meetings, segregation of overlapping speech and pitch analysis of concurrent sounds.

c) To determine whether analyses of overlapping talk based on objective acoustic analysis concur with previous analyses based on impressionistic listening.

These two issues are addressed in detail in our paper published in Speech Communication (Kurtic, Brown & Wells, 2013) and also in an article in Language and Speech (Gorisch, Wells & Brown, 2012). While that research was conducted using a different corpus, the ideas and techniques were largely developed while working with our new corpora. We plan to publish comparable analyses on the newly collected corpora in the near future.
d) To investigate the differences between the temporal and phonetic characteristics of overlapping talk in Standard Southern British English and Bosnian Serbo-Croatian, and to relate these differences to properties of the specific systems of the language (e.g., accentual and intonational systems).

So far we have compared the use of pitch (more precisely, fundamental frequency / F0) by speakers of the two languages when involved in overlapping talk. Our preliminary observations relate to the person who is doing the overlapping / interrupting, i.e. the overlapper, and also to the person interrupted, i.e. the overlappee. We address the question of whether, and if so how, speakers differentiate overlaps that compete for the floor from non-competitive overlaps.
In both languages overlappees and overlappers differ in their use of pitch as an interactional resource in the service of turn competition. In both languages, in order to compete for the floor, overlappers raise pitch compared to their talk outside overlap. Thus there seems to be a similar system in the two languages. However, this does not entail that non-competitive overlaps are differentiated from competitive overlaps only by F0: for American English, Kurtic (2011) has shown that loudness (intensity) is also involved. We continue to investigate this in the two languages.
In both languages, overlappees do not routinely raise pitch to fend off competition. There is some evidence that in Bosnian, overlappees actually lower their pitch when returning competition; whereas English-speaking overlappees use lowering of their pitch to yield the floor. If this finding is substantiated, it will indicate a genuine cross-linguistic difference.
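The comparison of a speaker's pitch inside and outside overlap can be made concrete with a small Python sketch. It assumes that F0 values in Hz have already been extracted for voiced frames only, and it expresses the difference in semitones using medians; both choices are illustrative assumptions rather than the measures used in the project, and `semitone_shift` is a hypothetical helper name.

```python
import math
from statistics import median


def semitone_shift(f0_in_overlap_hz, f0_baseline_hz):
    """Median F0 inside overlap relative to the same speaker's median F0
    outside overlap, in semitones.

    A positive value means the speaker's pitch is raised in overlap,
    a negative value means it is lowered.
    """
    ratio = median(f0_in_overlap_hz) / median(f0_baseline_hz)
    return 12 * math.log2(ratio)


# Invented values: a speaker whose median F0 doubles in overlap.
baseline = [100.0, 110.0, 120.0]    # Hz, talk outside overlap
in_overlap = [210.0, 220.0, 230.0]  # Hz, talk inside overlap
print(semitone_shift(in_overlap, baseline))  # one octave up, i.e. about 12 semitones
```

Working relative to each speaker's own baseline, on a logarithmic scale, keeps speakers with different habitual pitch ranges comparable.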
Our findings will contribute to theoretical discussion about universals of spoken interaction, and thus indirectly to debates about linguistic evolution. They will also be of interest to students of intercultural communication, and to researchers and practitioners in a range of applications, including speech and language pathology and spoken language systems for computers.
e) To investigate whether subtypes of overlapping talk (defined in terms of their phonetic design and placement of the overlap onset) have similar functions in Standard Southern British English and Bosnian Serbo-Croatian.

While we have made some progress in understanding these issues in relation to our English corpus, this theme still remains to be explored in detail in Bosnian. We have developed a strong collaboration with Bosnia-based linguists, particularly B Aljukic, whose input will be invaluable in taking this topic forward in the near future.
Exploitation Route

In research, the methods we have used to create our corpora are likely to influence the future collection of corpora where it is important to combine naturalness of the talk with high-quality audio recordings. We hope that our emphasis on the theoretical importance of overlapping talk, as a window into the organisation of turn-taking, intonation systems and on-line speech processing by humans, will encourage future researchers to adopt a similar orientation when working on other languages and populations. We look forward to the integration of our work with work on non-verbal aspects of conversation, including gesture and eye-gaze, as well as with linguistic (morphosyntactic and lexical) analyses of real conversation, in order to achieve a comprehensive understanding of how people talk.
One aspect of the impact potential of this research, for people using a cochlear implant, has been explained in the 'impact narrative'. Similar software applications could be developed for other users, including learners of English as a second language and possibly other populations with communication impairments that include issues with turn-taking, for example people with autism spectrum disorders.
Sectors Education, Healthcare

URL http://overlap.rcweb.dcs.shef.ac.uk
 
Description The main impact activity resulting from this project is the project Meeting the Challenge of Overlapping Talk for Cochlear Implant Users. Its aim is to develop useful training materials for cochlear implant users to practise handling simultaneous or overlapping talk in conversation. Overlapping talk is known to be a particular problem for individuals who have a hearing loss, even when using a conventional hearing aid or cochlear implant. Until recently, even in one-to-one settings many users would need optimum conditions in order to hold a satisfactory conversation, e.g. a quiet environment and an awareness on the part of both participants that they should avoid talking at the same time. Recent technological improvements in cochlear implant devices mean that it is now more realistic for users to attempt to engage in natural conversations in which overlapping talk is a common occurrence. However, there are currently no established training materials that hearing professionals can use to help cochlear implant users deal with the problem of simultaneous talk. In this project we are developing software-based training materials that address this gap and promote key conversational competencies in cochlear implant users. Graded tasks enable users to repeatedly practise (i) crucial listening skills (identifying the main speaker, recognising the semantic content of the speech signal, and understanding the social action underlying the conversational exchange) and (ii) speaking skills fundamental to multi-party conversation (using competitive and non-competitive overlaps appropriately). These materials will draw mainly on the outputs from the earlier project funded by AHRC, in which we developed a unique corpus and some key findings about overlapping talk in conversation. To the best of our knowledge, this will be the first training software for hearing-impaired users that specifically addresses the problems raised by overlapping talk.
We are working with five cochlear implant users, who are trying out the training materials in a series of focus groups and giving us feedback. This is coordinated by audiology and speech and language therapy colleagues from the local cochlear implant service. The project is a collaboration between the University of Sheffield and the NHS and will run from March 2014 to February 2015. It is funded by the Arts and Humanities Research Council under its Follow on Fund for Impact and Engagement scheme.
First Year Of Impact 2014
Sector Healthcare
Impact Types Societal

 
Description Emeritus fellowship
Amount £22,700 (GBP)
Funding ID EM 2015 017 
Organisation The Leverhulme Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 02/2016 
End 01/2018
 
Title A Corpus of Spontaneous Multi-Party Conversation in Bosnian Serbo-Croatian and British English. 
Description We have recorded an audio/visual corpus of conversations held in two European languages - Standard Southern British English and Bosnian Serbo-Croatian, totaling about 4 hours per language. There were four young adult participants for each language. The participants, who already knew each other, were recorded in a naturalistic setting in their home country. Each speaker was recorded on a separate audio channel, so that their talk can be analysed acoustically even when speaking in overlap. For a full description of the corpus, see Kurtic et al. (2012). All the recordings have been transcribed orthographically in ELAN and each instance of overlap has been annotated on a number of parameters. In due course we will make an annotated database of our audio/visual recordings available to other researchers via the Internet, and provide the facility for researchers to share their own annotations of the corpus. In this way, the corpus of recordings will be an evolving resource that will continue to benefit the research community beyond the life of the project. 
Type Of Material Database/Collection of data 
Year Produced 2013 
Provided To Others? Yes  
Impact In our current AHRC follow-on project we are developing software-based training materials that promote key conversational competencies in cochlear implant users. We are designing graded tasks to enable users to repeatedly practise (i) crucial listening skills and (ii) speaking skills fundamental to multi-party conversation. These materials draw directly on the outputs from our earlier AHRC-funded project on overlapping talk, in which we developed a unique corpus and some key findings about overlapping talk in conversation. To the best of our knowledge, this will be the first training software for hearing-impaired users that specifically addresses the problems raised by overlapping talk. Although in this project we are working with English CI users, we have made use of our Bosnian corpus with them, to illustrate universal conversational practices.
URL http://overlap.rcweb.dcs.shef.ac.uk/?page_id=6
 
Description Bosnian conversation: recording and transcription of a corpus 
Organisation University of Tuzla
Department Philosophy Faculty
Country Bosnia and Herzegovina 
Sector Academic/University 
PI Contribution Dr Emina Kurtic, research assistant at the University of Sheffield and native speaker of Bosnian, travelled to Tuzla to oversee the recording of the corpus and to set up a team of transcribers recruited from postgraduate students of linguistics at the University of Tuzla.
Collaborator Contribution Our partner helped to identify suitable participants and transcribers and provided a venue and facilities for the recordings. This was invaluable. They also facilitated publication of two articles authored or co-authored by Emina Kurtic in the Bosnian linguistics journal Bozanski Jesik, outlining the method and approach to conversation research for a Bosnian readership.
Impact 1) A 4-hour audio-visual corpus of Bosnian conversation has been recorded, transcribed and annotated. This has been described in a peer-reviewed publication. 2) Emina Kurtic has established a research collaboration with one of the Tuzla team, Bernes Aljukic. They have co-authored a conference paper and are working on a second publication. 3) Emina Kurtic has authored two articles related to the joint project in the journal Bozanski Jesik. Publication details for the above are associated with this award on Researchfish.
Start Year 2008
 
Description iCog Workshop: Turn-taking in conversation: a multi-disciplinary approach 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Following on from two keynote talks from different disciplinary perspectives, Emina introduced the Sheffield Free Talk conversational dataset and suggested points for discussion. The workshop participants then gathered in small groups and spent an hour uncovering varying approaches to segmentation, turn-taking cues, and potential applications of the two datasets in their particular lines of research.

We received some very encouraging feedback from participants at the workshop, and are pleased that the day was so well received. We are not planning a formal follow-up to the workshop itself, though we will hold an end-of-project event in February 2015, when our current AHRC grant expires. Workshop participants are welcome to return at that stage to find out more about the progress we have made.
Year(s) Of Engagement Activity 2014
URL http://overlap.rcweb.dcs.shef.ac.uk/?p=141