An Edinburgh Speech Production Facility
Lead Research Organisation: Queen Margaret University
Department Name: Clinical Audiology Speech & Lang Res Cen
Abstract
The proposal is for a facility designed to record and analyse the movements of the lips, tongue, and jaw during spoken dialogue. This facility will be the first of its kind in the UK, and will be useful for applications in speech recognition and speech synthesis, as well as for developing theories of the cognitive representations and processes involved in normal and impaired speech production. The first output of the facility will be a database of recorded dialogue that will be useful for researchers interested in the relationships between speech movement and acoustics (important for speech technology applications), as well as in the particular types of pronunciations that speakers use during spontaneous dialogue.
Publications
Felps D (2012) Foreign Accent Conversion Through Concatenative Synthesis in the Articulatory Domain. IEEE Transactions on Audio, Speech, and Language Processing.
Felps D (2010) Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database. Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010.
Geng C (2013) Recording speech articulation in dialogue: Evaluating a synchronized double electromagnetic articulography setup. Journal of Phonetics.
Scobbie J (2013) The Edinburgh Speech Production Facility DoubleTalk Corpus.
Scobbie JM (2011) A common co-ordinate system for mid-sagittal articulatory measurement.
Scobbie JM (2011) Audible aspects of speech preparation.
Description | We created a publicly available corpus of speech recordings that includes synchronised articulatory and acoustic records of speech in dialogue. Our facility is available for further funded use; we offer calibration, gluing, recording, and data post-processing services. We also commissioned the development of data analysis software, available through Articulate Instruments Ltd. The DoubleTalk articulatory speech corpus includes synchronised audio and articulatory trajectories for 12 speakers of English. The corpus was collected at the Edinburgh Speech Production Facility (ESPF) using two synchronised Carstens AG500 electromagnetic articulometers. The first release of the corpus comprises orthographic transcriptions aligned at phrasal level to EMA and audio data for each of six mixed-dialect speaker pairs. It is available from the ESPF online archive (http://espf.ppls.ed.ac.uk/frontend.php/project/espf-doubletalk). A variety of tasks was used to elicit a wide range of speech styles, including monologue (a modified Comma Gets a Cure and spontaneous story-telling), structured spontaneous dialogue (Map Task and Diapix), a wordlist task, a memory-recall task, and a shadowing task. To enable wider use of EMA data from the ESPF, Articulate Instruments Ltd produced an additional component of its AAA software specifically to handle EMA data. This commercially available software, designed for articulatory speech analysis, therefore allows users familiar with other articulatory data to access and analyse EMA data without having to learn new software. In addition, the company's contribution to the design of the facility enables synchronised collection of electropalatography (EPG) data. |
Exploitation Route | Speech Graphics (http://www.speech-graphics.com/) have used EMA data to underpin their acoustically driven lip-synched facial animation. |
Sectors | Creative Economy, Digital/Communication/Information Technologies (including Software), Other |
URL | http://www.lel.ed.ac.uk/projects/ema/ |
Description | Speech Graphics (http://www.speech-graphics.com/) have used EMA data as part of the underpinnings of their lip-synch animation, used in gaming and other creative industries. |
First Year Of Impact | 2013 |
Sector | Creative Economy, Digital/Communication/Information Technologies (including Software) |
Impact Types | Cultural, Economic |