Intonational Variation in Arabic

Lead Research Organisation: University of York

Department Name: Language and Linguistic Science

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Funded Value:

£324,158

Funded Period:

Apr 11 - Jun 17

Funder:

ESRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

ES/I010106/1

Principal Investigator:

Samantha Hellmuth

Research Subject:

Linguistics (100%)

Research Topic:

Linguistics (General) (100%)

Organisations

People	ORCID iD
Samantha Hellmuth (Principal Investigator)

Publications

Almbark R (2019) Is there an interlanguage intelligibility benefit in perception of English word stress? in Loquens

Almbark R (2014) Acquiring the phonetics and phonology of English word stress: Comparing learners from different L1 backgrounds

Alzamil A (2021) The prosodic realisation of focus in Saudi Arabic dialects in comparative perspective

Bouchhioua N (2019) Studies on Arabic Dialectology and Sociolinguistics: Proceedings of the 12th International Conference of AIDA held in Marseille from 30th May- 2nd June 2017.

Brown G (2022) Computational modelling of segmental and prosodic levels of analysis for capturing variation across Arabic dialects in Speech Communication

Bruggeman A (2020) Acoustic correlates of lexical stress in Moroccan Arabic in Journal of the International Phonetic Association

Gargett A (2014) DiVE-Arabic: Gulf Arabic Dialogue in a Virtual Environment

Hellmuth S (2018) Variation in polar interrogative contours within and between Arabic dialects

Hellmuth S (2015) F0 peak alignment in Moroccan Arabic polar questions.

Key Findings
Impact Summary
Further Funding
Research Databases and Models
Research Tools and Methods
Collaboration
Engagement Activities


Description	The most significant achievement of the grant is the collection of a large parallel corpus of speech data, elicited for the purposes of intonational analysis, in eight dialects of Arabic. In addition, we collected data in one dialect (Moroccan Arabic) with older speakers as well as younger speakers, and with speakers who are also fluent in Tamazight. This additional data allows us to explore potential changes in progress and variation due to language contact in this dialect. Analysis of the data (using a mix of quantitative and qualitative techniques) confirms that there are clear differences in the 'basic' intonation patterns across Arabic dialects. In some cases we discovered intonation patterns that have not previously been described; for example, in Tunisian Arabic a specific rise-fall intonation contour is associated with a 'question marker' (a vowel added to the end of the word) and it is the combination of these two that turns a statement into a question (Hellmuth in press; Bouchhioua et al 2019). Similar detailed findings in individual dialects, as well as an overview of the scope of variation across dialects, will be documented in a forthcoming book-length publication. We used tried and tested techniques to elicit our speech data, but additional work was required to adapt these for use in Arabic, due to the particular features of the Arabic language situation (that is, the fact that dialectal Arabic is, on the whole, unwritten). These methods and the rationale of the corpus design are set out in a recent book chapter (Hellmuth 2014). The methods used for prosodic analysis of the corpus data have evolved in line with recent advances in the field (D'Imperio, M., Cangemi, F., & Grice, M., 2016).
As a result we moved away from an approach based primarily on qualitative analysis (manual prosodic transcription) to a mixed methods approach in which the results of qualitative analysis are compared to the results of quantitative analysis (visualisation of F0 contours and statistical analysis). We reflect on the merits of this approach in the methodology section of our forthcoming book length publication for Oxford University Press, 'Intonation in Spoken Arabic Dialects' (Hellmuth, in preparation). In addition to archiving of the full corpus (audio data + transcriptions) with the UK Data Service (completed), an interactive online searchable database has been constructed, and will be used to facilitate use of the data by non-academic users (allowing searches for individual dialects or sentence types, for example); updates on the availability of new tranches of the data via this interactive database will be made available on the project website: http://ivar.york.ac.uk/.
Exploitation Route	The findings of our research will be useful to learners and teachers of Arabic, who will benefit from the availability of descriptions of the pronunciation differences between different dialects of Arabic, and from the availability of sample sound recordings to download. To lay a foundation for this use, we produced a position paper explaining why, in particular, a description of the intonation patterns of different dialects may be useful for learners and teachers of Arabic (Hellmuth 2014). The paper takes research-led recommendations for teaching of the pronunciation of English as a starting point and explores what the equivalent recommendations would be for Arabic, taking into account the known differences between the two languages. We have also produced papers i) to demonstrate an innovative methodology for collecting interactive data in languages such as Arabic where the written form of the language differs from the spoken form (Gargett et al 2014), and ii) to explore whether or not it is possible to detect traces of a person's mother tongue Arabic dialect when they are speaking English as a foreign language (Almbark et al 2014). Recordings from the IVAr database have been used in development of a prototype online training module designed to evaluate the extent to which 'lay listeners' (with no prior knowledge of linguistics or of Arabic dialects) can be trained to identify differences between spoken Arabic dialects more reliably. Data from the corpus have been used to investigate whether there is lexical stress in Moroccan Arabic, in collaboration with colleagues at the University of Cologne, and as input to testing of a system for automated accent detection (Y-ACCDIST) in collaboration with colleagues from Lancaster University. The corpus data and/or methods have been exploited directly and in depth in two completed PhD projects at the University of York, with four more in progress.
Sectors	Digital/Communication/Information Technologies (including Software); Education; Government, Democracy and Justice; Security and Diplomacy
URL	http://ivar.york.ac.uk/
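
The Key Findings description mentions quantitative analysis based on visualisation of F0 contours but gives no implementation detail. One common way to compare contours from utterances of different durations is to resample each onto a normalised time axis before plotting or averaging. The following is a minimal sketch of that idea only, assuming F0 has already been extracted as paired time and Hz values with strictly increasing times; the function name `time_normalise` and the sample values are hypothetical and are not taken from the IVAr project.

```python
from bisect import bisect_right
from typing import List, Sequence


def time_normalise(times: Sequence[float], f0: Sequence[float],
                   n_points: int = 10) -> List[float]:
    """Resample an F0 contour onto n_points equally spaced points
    between its first and last sample, using linear interpolation.

    Assumes `times` is strictly increasing (an assumption of this sketch).
    """
    if len(times) != len(f0) or len(times) < 2:
        raise ValueError("need at least two (time, F0) samples of equal length")
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    start, duration = times[0], times[-1] - times[0]
    if duration <= 0:
        raise ValueError("contour must have positive duration")

    resampled = []
    for i in range(n_points):
        t = start + duration * i / (n_points - 1)
        # Index of the sample to the right of t, clamped to a valid segment.
        j = min(max(bisect_right(times, t), 1), len(times) - 1)
        t0, t1 = times[j - 1], times[j]
        weight = (t - t0) / (t1 - t0)
        resampled.append(f0[j - 1] + weight * (f0[j] - f0[j - 1]))
    return resampled


# Hypothetical rise-fall contour sampled at three time points (seconds, Hz).
contour = time_normalise([0.0, 0.5, 1.0], [100.0, 200.0, 100.0], n_points=5)
print(contour)
```

Contours resampled to the same number of points can then be overlaid or averaged per dialect and sentence type, whatever their original durations.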


Description	The IVAr corpus has been used for testing and development of the Y-ACCDIST accent detection tool (Brown & Hellmuth, forthcoming). Y-ACCDIST is a computational tool which can be "used to inspect sociophonetic corpora as a preliminary 'screening' tool" (Brown & Wormald 2017, JASA, p.422).
First Year Of Impact	2018
Sector	Digital/Communication/Information Technologies (including Software); Government, Democracy and Justice
Impact Types	Societal


Description	University of York ESRC Impact Acceleration Account (York ESRC IAA): Responsive Mode
Amount	£1,000 (GBP)
Funding ID	ESRC IAA Apr 2014-Mar 2019 ES/M500574/1
Organisation	University of York
Sector	Academic/University
Country	United Kingdom
Start	01/2018
End	03/2018


Description	University of York ESRC Impact Acceleration Account (York ESRC IAA): Standard Grant
Amount	£22,050 (GBP)
Funding ID	ESRC IAA Apr 2019-Mar 2023 ES/T502066/1
Organisation	University of York
Sector	Academic/University
Country	United Kingdom
Start	08/2021
End	08/2022


Title	Implementation of the ProsodyLab forced alignment tool for dialectal Arabic
Description	We adapted the open source Python scripts of the ProsodyLab Aligner forced alignment tool, distributed by the McGill prosodylab, to align text transcriptions of the IVAr data to the audio recordings, resulting in time-aligned Praat textgrids at the word (and segment) level. An innovation in our lab was adaptation of the tools to ensure robust alignment of longer sound files (i.e. containing longer narratives and/or conversations).
Type Of Material	Improvements to research infrastructure
Provided To Others?	No
Impact	HMM models for each dialect analysed, and Praat textgrids automatically time-aligned at the word (and segment) level to audio recordings. Textgrids time-aligned at the word level will be made available alongside the audio files via the IVAr database.
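
The record above does not say how longer sound files were handled. One common strategy, shown here purely as an assumption, is to align a long recording in shorter chunks and then shift each chunk's word intervals back onto the timeline of the full recording. The sketch below illustrates only that merging step in plain Python; it does not call the ProsodyLab Aligner or read Praat textgrids, and the names `Interval` and `merge_chunk_alignments` are hypothetical.

```python
from typing import List, Tuple

# (start_seconds, end_seconds, word), timed relative to the start of a chunk
Interval = Tuple[float, float, str]


def merge_chunk_alignments(
    chunks: List[Tuple[float, List[Interval]]]
) -> List[Interval]:
    """Combine per-chunk word alignments into one timeline.

    Each chunk is (offset_seconds, intervals), where offset_seconds is the
    position of the chunk's start within the full recording.
    """
    merged: List[Interval] = []
    for offset, intervals in sorted(chunks, key=lambda chunk: chunk[0]):
        for start, end, word in intervals:
            if end < start:
                raise ValueError(f"interval for {word!r} ends before it starts")
            global_start, global_end = start + offset, end + offset
            # Small tolerance so floating-point noise is not reported as overlap.
            if merged and global_start < merged[-1][1] - 1e-6:
                raise ValueError(f"interval for {word!r} overlaps the previous one")
            merged.append((round(global_start, 3), round(global_end, 3), word))
    return merged


# Hypothetical example: two chunks of one recording, given out of order.
aligned = merge_chunk_alignments([
    (30.0, [(0.0, 0.5, "c")]),
    (0.0, [(0.1, 0.4, "a"), (0.4, 0.9, "b")]),
])
print(aligned)
```

Checking for overlaps at merge time is a cheap way to catch a chunk boundary that was recorded with the wrong offset.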


Title	Intonational Variation in Arabic Corpus
Description	The Intonational Variation in Arabic (IVAr) corpus is one of the primary outputs of the IVAr project. It is a parallel corpus of speech data in eight dialects of Arabic (plus one bilingual sub-corpus dataset and one dataset collected with speakers in a different age range). Data collection was completed in September 2015. All of the read speech portions of the data are orthographically transcribed and time-aligned to the digital audio signal using forced alignment. Transcriptions are also available for at least half of the spontaneous speech portions of the database. All speech data and all available transcriptions have been deposited with UKDS.
Type Of Material	Database/Collection of data
Year Produced	2017
Provided To Others?	Yes
Impact	Testing and development of the Y-ACCDIST accent detection tool (ongoing).
URL	http://reshare.ukdataservice.ac.uk/852878/


Description	BAB-MSA
Organisation	University of Jordan
Country	Jordan
Sector	Academic/University
PI Contribution	We have created a corpus of Boundary Annotated Broadcast Modern Standard Arabic (BAB-MSA) for input to computational analysis. The annotations are informed by our work on development of prosodic annotation protocols for regional Arabic dialects.
Collaborator Contribution	Our partners, Dr Claire Brierley (Leeds) and Majdi Sawalha (Jordan), then used the corpus to test a model of automated phrase break prediction.
Impact	This research is multidisciplinary: linguistics ~ computer science. The resulting journal article is currently awaiting further revision.
Start Year	2013


Description	BAB-MSA
Organisation	University of Leeds
Country	United Kingdom
Sector	Academic/University
PI Contribution	We have created a corpus of Boundary Annotated Broadcast Modern Standard Arabic (BAB-MSA) for input to computational analysis. The annotations are informed by our work on development of prosodic annotation protocols for regional Arabic dialects.
Collaborator Contribution	Our partners, Dr Claire Brierley (Leeds) and Majdi Sawalha (Jordan), then used the corpus to test a model of automated phrase break prediction.
Impact	This research is multidisciplinary: linguistics ~ computer science. The resulting journal article is currently awaiting further revision.
Start Year	2013


Description	DiVE-Arabic
Organisation	University of Birmingham
Country	United Kingdom
Sector	Academic/University
PI Contribution	In one of our fieldwork locations we collected an additional corpus of data elicited using a virtual world game environment developed by Andrew Gargett (University of Birmingham), which yields audio data time-aligned with a log of the actions (movements/orientations) in the virtual world. Dr Gargett is developing methods for annotation and/or analysis of the actions data.
Collaborator Contribution	We will provide prosodic annotation of the audio data, using the annotation protocols for the dialect in question, once these are developed (based on the main IVAr corpus data). Once the two levels of analysis are available we will have a rich resource for examining the role of prosody and intonation in situated dialogue in Arabic (for the first time).
Impact	DiVE-Arabic: Gulf Arabic Dialogue in a Virtual Environment. / Gargett, Andrew; AlGethami, Ghazi; Hellmuth, Sam. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). European Language Resources Association (ELRA), 2014. This collaboration is multi-disciplinary: linguistics ~ computer science.
Start Year	2013


Description	Investigation of the correlates of stress in spoken Arabic dialects
Organisation	University of Manouba
Country	Tunisia
Sector	Academic/University
PI Contribution	We have collected parallel data in (so far) 8 dialects of Arabic, to determine the phonetic correlates of word level stress in each dialect, using an elicitation paradigm devised by Dr Bouchhioua. The resultant data will allow directly parallel comparison of the correlates of word stress across Arabic dialects for the first time. We will analyse the data after completion of the annotation of the main IVAr data for each dialect.
Collaborator Contribution	The elicitation paradigm was devised by our partner, Dr Nadia Bouchhioua of the Universite de la Manouba, Tunis, Tunisia. A journal article is currently under review.
Impact	Acquiring the phonetics and phonology of English word stress: Comparing learners from different L1 backgrounds. / Alhussein Almbark, Rana; Bouchhioua, Nadia; Hellmuth, Sam. In: Concordia Working Papers in Applied Linguistics, Vol. 5, 2014, p. 19-35.
Start Year	2013


Description	Language support for Arabic-speaking refugees
Organisation	Newcastle University
Country	United Kingdom
Sector	Academic/University
PI Contribution	"Arabic at Home" briefings for the families at the Refugee Council drop-in, Selby and for staff and volunteers of the Refugee Council, Leeds.
Collaborator Contribution	We delivered briefings on home language maintenance in Arabic for staff, volunteers and clients of the Refugee Council in North Yorkshire. The materials were prepared, and briefings delivered, by Sam Hellmuth/Rana Almbark (University of York) and Ghada Khattab (Newcastle University).
Impact	A UKRI grant application for work to improve pronunciation training for Syrian Arabic speaking learners of English and German is currently under review.
Start Year	2018


Description	Moroccan Arabic bilingual sub-corpus
Organisation	University of Hassan II Casablanca
Country	Morocco
Sector	Academic/University
PI Contribution	We collected a 'cluster' sub-corpus of the IVAr dataset in Casablanca, with data from bilingual speakers of Arabic and Tamazight, in two age groups.
Collaborator Contribution	Our local partners assisted with data collection and transcription in Morocco, and also travelled to the UK to assist with initial data analysis.
Impact	Results of a pilot study on a portion of the data were presented at the 18th ICPhS conference: Hellmuth, S., Alhussein Almbark, R., Chlaihani, B., Louriz, N. (2015). F0 peak alignment in Moroccan Arabic polar questions. Proceedings of the 18th ICPhS, Glasgow. Data analysis of the full bilingual dataset (young speakers) is in progress and a journal article is currently under review.
Start Year	2015


Description	Y-ACCDIST accent detection
Organisation	Lancaster University
Department	Department of Linguistics and English Language
Country	United Kingdom
Sector	Academic/University
PI Contribution	Provision of data for testing of the Y-ACCDIST accent detection system for spoken dialects of Arabic.
Collaborator Contribution	Provision of accent detection tools for testing of the Y-ACCDIST accent detection system for spoken dialects of Arabic.
Impact	Initial scoping work funded by a small Responsive Mode award from the University of York ESRC Impact Acceleration Account allowed for Proof of Concept work in collaboration with an external commercial partner. A follow up collaborative project with a different external partner is ongoing in 2022, funded by a Main Scheme award from the University of York ESRC Impact Acceleration Account.
Start Year	2017


Description	Arabic at Home briefing: Refugee Council drop-in, Selby.
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Third sector organisations
Results and Impact	We delivered bilingual briefings on home language maintenance for Arabic to staff, volunteers and clients of the Refugee Council in North Yorkshire. The materials were prepared, and briefings delivered, by Sam Hellmuth/Rana Almbark (University of York) and Ghada Khattab (Newcastle University).
Year(s) Of Engagement Activity	2018
URL	https://ivar.york.ac.uk/outreach


Description	Radio broadcast (Word of Mouth)
Form Of Engagement Activity	A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Media (as a channel to the public)
Results and Impact	Participation in Radio 4 'Word of Mouth' programme on 'Intonation: the Music of Speech' focussed on variation in the form and function of intonation across languages.
Year(s) Of Engagement Activity	2017
URL	http://www.bbc.co.uk/programmes/b08dnrqd


Description	The role of language and language choices in participatory and collaborative work
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Third sector organisations
Results and Impact	An invited contribution to the York Migration Network Ideas Salon #2 on the role of language and language choices in participatory and collaborative work on migration. The talk included a linguist's response to a visit to the York Art Gallery 'The Sea is the Limit' exhibition and outlined planned work to support home language maintenance for refugee families settled in Yorkshire (and beyond).
Year(s) Of Engagement Activity	2018
URL	https://www.york.ac.uk/social-science/research/migration-network/events/2018/mignet-ideas-salon-2/
