Intonational Variation in Arabic
Lead Research Organisation: University of York
Department Name: Language and Linguistic Science
Abstract
Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
Organisations
- University of York (Lead Research Organisation)
- University of Manouba (Collaboration)
- University of Jordan (Collaboration)
- University of Hassan II Casablanca (Collaboration)
- Lancaster University (Collaboration)
- Newcastle University (Collaboration)
- University of Birmingham (Collaboration)
- University of Leeds (Collaboration)
People
Samantha Hellmuth (Principal Investigator)
Publications
- Almbark R (2019) Is there an interlanguage intelligibility benefit in perception of English word stress? In Loquens
- Brown G (2022) Computational modelling of segmental and prosodic levels of analysis for capturing variation across Arabic dialects. In Speech Communication
- Bruggeman A (2020) Acoustic correlates of lexical stress in Moroccan Arabic. In Journal of the International Phonetic Association
- Gargett A (2014) DiVE-Arabic: Gulf Arabic Dialogue in a Virtual Environment
- Hellmuth S (2015) F0 peak alignment in Moroccan Arabic polar questions
Description | The most significant achievement of the grant is the collection of a large parallel corpus of speech data, elicited for the purposes of intonational analysis, in eight dialects of Arabic. In addition we collected data in one dialect (Moroccan Arabic) with older speakers as well as younger speakers, and with speakers who are also fluent speakers of Tamazight. This additional data allows us to explore potential changes in progress and variation due to language contact in this dialect. Analysis of the data (using a mix of quantitative and qualitative techniques) confirms that there are clear differences in the 'basic' intonation patterns across Arabic dialects. In some cases we discovered intonation patterns that have not previously been described; for example, in Tunisian Arabic a specific rise-fall intonation contour is associated with a 'question marker' (a vowel added to the end of the word) and it is the combination of these two that turns a statement into a question (Hellmuth in press; Bouchhioua et al 2019). Similar detailed findings in individual dialects, as well as an overview of the scope of variation across dialects, will be documented in a forthcoming book length publication. We used tried and tested techniques to elicit our speech data, but additional work was required to adapt these for use in Arabic, due to the particular features of the Arabic language situation (that is, the fact that dialectal Arabic is, on the whole, unwritten). These methods and the rationale of the corpus design are set out in a recent book chapter (Hellmuth 2014). The methods used for prosodic analysis of the corpus data have evolved in line with recent advances in the field (D'Imperio, M., Cangemi, F., & Grice, M., 2016).
As a result we moved away from an approach based primarily on qualitative analysis (manual prosodic transcription) to a mixed methods approach in which the results of qualitative analysis are compared to the results of quantitative analysis (visualisation of F0 contours and statistical analysis). We reflect on the merits of this approach in the methodology section of our forthcoming book length publication for Oxford University Press, 'Intonation in Spoken Arabic Dialects' (Hellmuth, in preparation). In addition to archiving of the full corpus (audio data + transcriptions) with the UK Data Service (completed), an interactive online searchable database has been constructed, and will be used to facilitate use of the data by non-academic users (allowing searches for individual dialects or sentence types, for example); updates on the availability of new tranches of the data via this interactive database will be made available on the project website: http://ivar.york.ac.uk/. |
Exploitation Route | The findings of our research will be useful to learners and teachers of Arabic, who will benefit from the availability of descriptions of the pronunciation differences between different dialects of Arabic, and from the availability of sample sound recordings to download. To lay a foundation for this use, we produced a position paper explaining why, in particular, a description of the intonation patterns of different dialects may be useful for learners and teachers of Arabic (Hellmuth 2014). The paper takes research-led recommendations for teaching of the pronunciation of English as a starting point and explores what the equivalent recommendations would be for Arabic, taking into account the known differences between the two languages. We have also produced papers i) to show innovative methodology used to collect interactive data in languages such as Arabic where the written form of the language differs from the spoken form (Gargett et al 2014), and ii) to explore whether or not it is possible to detect traces of a person's mother tongue Arabic dialect when they are speaking English as a foreign language (Almbark et al 2014). Recordings from the IVAr database have been used in development of a prototype online training module designed to evaluate the extent to which 'lay listeners' (with no prior knowledge of linguistics or of Arabic dialects) can be trained to more reliably identify differences between spoken Arabic dialects. Data from the corpus have been used to investigate whether there is lexical stress in Moroccan Arabic, in collaboration with colleagues at the University of Cologne, and as input to testing of a system for automated accent detection (Y-ACCDIST) in collaboration with colleagues from Lancaster University. The corpus data and/or methods have been exploited directly and in depth in two completed PhD projects at the University of York, with four more in progress. |
Sectors | Digital/Communication/Information Technologies (including Software); Education; Government, Democracy and Justice; Security and Diplomacy |
URL | http://ivar.york.ac.uk/ |
Description | The IVAr corpus has been used for testing and development of the Y-ACCDIST accent detection tool (Brown & Hellmuth, forthcoming). Y-ACCDIST is a computational tool which can be "used to inspect sociophonetic corpora as a preliminary 'screening' tool" (Brown & Wormald 2017, JASA, p.422). |
First Year Of Impact | 2018 |
Sector | Digital/Communication/Information Technologies (including Software); Government, Democracy and Justice |
Impact Types | Societal |
Description | University of York ESRC Impact Acceleration Account (York ESRC IAA): Responsive Mode |
Amount | £1,000 (GBP) |
Funding ID | ESRC IAA Apr 2014-Mar 2019 ES/M500574/1 |
Organisation | University of York |
Sector | Academic/University |
Country | United Kingdom |
Start | 01/2018 |
End | 03/2018 |
Description | University of York ESRC Impact Acceleration Account (York ESRC IAA): Standard Grant |
Amount | £22,050 (GBP) |
Funding ID | ESRC IAA Apr 2019-Mar 2023 ES/T502066/1 |
Organisation | University of York |
Sector | Academic/University |
Country | United Kingdom |
Start | 08/2021 |
End | 08/2022 |
Title | Implementation of the ProsodyLab forced alignment tool for dialectal Arabic |
Description | We adapted the open source Python scripts distributed by the McGill prosodylab for the ProsodyLab Aligner forced alignment tool, and used them to align text transcriptions of the IVAr data to the audio recordings, resulting in time-aligned Praat textgrids at the word (and segment) level. An innovation in our lab was adaptation of the tools to ensure robust alignment of longer sound files (i.e. those containing longer narratives and/or conversations). |
Type Of Material | Improvements to research infrastructure |
Provided To Others? | No |
Impact | HMM models for each dialect analysed, and Praat textgrids automatically time-aligned at the word (and segment) level to audio recordings. Textgrids time-aligned at the word level will be made available alongside the audio files via the IVAr database. |
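The record does not say how the longer sound files were handled. One common approach, assumed here purely for illustration, is to align a long recording in shorter chunks and then shift each chunk's word intervals back onto the timeline of the full recording. The `Interval` class and the `merge_chunk_alignments` function in this minimal Python sketch are hypothetical names; they are not part of ProsodyLab Aligner or Praat.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Interval:
    """One labelled stretch of audio, with times in seconds."""
    start: float
    end: float
    label: str


def merge_chunk_alignments(
    chunks: List[Tuple[float, List[Interval]]]
) -> List[Interval]:
    """Combine per-chunk word alignments into a single word tier.

    Each item in `chunks` is (offset, intervals), where `offset` is the
    position of the chunk in the full recording (assumed to be known from
    however the recording was split) and the interval times are relative
    to the start of that chunk.
    """
    merged: List[Interval] = []
    for offset, intervals in chunks:
        for iv in intervals:
            # Shift chunk-relative times onto the full-recording timeline.
            merged.append(Interval(iv.start + offset, iv.end + offset, iv.label))
    # Sort by start time so the result reads as one continuous tier.
    merged.sort(key=lambda iv: iv.start)
    return merged
```

The merged intervals could then be written out as a single word-level tier for the whole recording; how chunk boundaries are chosen (for example at pauses) is left open in this sketch.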
Title | Intonational Variation in Arabic Corpus |
Description | The Intonational Variation in Arabic (IVAr) corpus is one of the primary outputs of the IVAr project. It is a parallel corpus of speech data in eight dialects of Arabic (plus one bilingual sub-corpus dataset and one dataset collected with speakers in a different age range). Data collection was completed in September 2015. All of the read speech portions of the data are orthographically transcribed, and the transcriptions are time-aligned to the digital audio signal using forced alignment. Transcriptions are also available for at least half of the spontaneous speech portions of the database. All speech data and all available transcriptions have been deposited with UKDS. |
Type Of Material | Database/Collection of data |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | Testing and development of the Y-ACCDIST accent detection tool (ongoing). |
URL | http://reshare.ukdataservice.ac.uk/852878/ |
Description | BAB-MSA |
Organisation | University of Jordan |
Country | Jordan |
Sector | Academic/University |
PI Contribution | We have created a corpus of Boundary Annotated Broadcast Modern Standard Arabic (BAB-MSA) for input to computational analysis. The annotations are informed by our work on development of prosodic annotation protocols for regional Arabic dialects. |
Collaborator Contribution | Our partners, Dr Claire Brierley (Leeds) and Majdi Sawalha (Jordan), then used the corpus to test a model of automated phrase break prediction. |
Impact | This research is multidisciplinary: linguistics ~ computer science. The resulting journal article is currently awaiting further revision. |
Start Year | 2013 |
Description | BAB-MSA |
Organisation | University of Leeds |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We have created a corpus of Boundary Annotated Broadcast Modern Standard Arabic (BAB-MSA) for input to computational analysis. The annotations are informed by our work on development of prosodic annotation protocols for regional Arabic dialects. |
Collaborator Contribution | Our partners, Dr Claire Brierley (Leeds) and Majdi Sawalha (Jordan), then used the corpus to test a model of automated phrase break prediction. |
Impact | This research is multidisciplinary: linguistics ~ computer science. The resulting journal article is currently awaiting further revision. |
Start Year | 2013 |
Description | DiVE-Arabic |
Organisation | University of Birmingham |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | In one of our fieldwork locations we collected an additional corpus of data elicited using a virtual world game environment developed by Andrew Gargett (University of Birmingham), which yields audio data that is time-aligned with a log of the actions (movements/orientations) in the virtual world. Dr Gargett is developing methods for annotation and/or analysis of the actions data. |
Collaborator Contribution | We will provide prosodic annotation of the audio data, using the annotation protocols for the dialect in question, once these are developed (based on the main IVAr corpus data). Once the two levels of analysis are available we will have a rich resource for examining the role of prosody and intonation in situated dialogue in Arabic (for the first time). |
Impact | DiVE-Arabic: Gulf Arabic Dialogue in a Virtual Environment. / Gargett, Andrew; AlGethami, Ghazi; Hellmuth, Sam. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). European Language Resources Association (ELRA), 2014. This collaboration is multi-disciplinary: linguistics ~ computer science. |
Start Year | 2013 |
Description | Investigation of the correlates of stress in spoken Arabic dialects |
Organisation | University of Manouba |
Country | Tunisia |
Sector | Academic/University |
PI Contribution | We have collected parallel data in (so far) 8 dialects of Arabic, to determine the phonetic correlates of word level stress in each dialect, using an elicitation paradigm devised by Dr Bouchhioua. The resultant data will allow directly parallel comparison of the correlates of word stress across Arabic dialects for the first time. We will analyse the data after completion of the annotation of the main IVAr data for each dialect. |
Collaborator Contribution | The elicitation paradigm was devised by our partner, Dr Nadia Bouchhioua of the Université de la Manouba, Tunis, Tunisia. A journal article is currently under review. |
Impact | Acquiring the phonetics and phonology of English word stress: Comparing learners from different L1 backgrounds. / Alhussein Almbark, Rana; Bouchhioua, Nadia; Hellmuth, Sam. In: Concordia Working Papers in Applied Linguistics, Vol. 5, 2014, p. 19-35. |
Start Year | 2013 |
Description | Language support for Arabic-speaking refugees |
Organisation | Newcastle University |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | "Arabic at Home" briefings for the families at the Refugee Council drop-in, Selby and for staff and volunteers of the Refugee Council, Leeds. |
Collaborator Contribution | We delivered briefings on home language maintenance in Arabic for staff, volunteers and clients of the Refugee Council in North Yorkshire. The materials were prepared, and briefings delivered, by Sam Hellmuth/Rana Almbark (University of York) and Ghada Khattab (Newcastle University). |
Impact | A UKRI grant application for work to improve pronunciation training for Syrian Arabic speaking learners of English and German is currently under review. |
Start Year | 2018 |
Description | Moroccan Arabic bilingual sub-corpus |
Organisation | University of Hassan II Casablanca |
Country | Morocco |
Sector | Academic/University |
PI Contribution | We collected a 'cluster' sub-corpus of the IVAr dataset in Casablanca, with data from bilingual speakers of Arabic and Tamazight, in two age groups. |
Collaborator Contribution | Our local partners assisted with data collection and transcription in Morocco, and also travelled to the UK to assist with initial data analysis. |
Impact | Results of a pilot study on a portion of the data were presented at the 18th ICPhS conference: Hellmuth, S., Alhussein Almbark, R., Chlaihani, B., Louriz, N. (2015). F0 peak alignment in Moroccan Arabic polar questions. Proceedings of the 18th ICPhS, Glasgow. Data analysis of the full bilingual dataset (young speakers) is in progress and a journal article is currently under review. |
Start Year | 2015 |
Description | Y-ACCDIST accent detection |
Organisation | Lancaster University |
Department | Department of Linguistics and English Language |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Provision of data for testing of the Y-ACCDIST accent detection system for spoken dialects of Arabic. |
Collaborator Contribution | Provision of accent detection tools for testing of the Y-ACCDIST accent detection system for spoken dialects of Arabic. |
Impact | Initial scoping work funded by a small Responsive Mode award from the University of York ESRC Impact Acceleration Account allowed for Proof of Concept work in collaboration with an external commercial partner. A follow-up collaborative project with a different external partner is ongoing in 2022, funded by a Main Scheme award from the University of York ESRC Impact Acceleration Account. |
Start Year | 2017 |
Description | Arabic at Home briefing: Refugee Council drop-in, Selby. |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Third sector organisations |
Results and Impact | We delivered bilingual briefings on home language maintenance for Arabic to staff, volunteers and clients of the Refugee Council in North Yorkshire. The materials were prepared, and briefings delivered, by Sam Hellmuth/Rana Almbark (University of York) and Ghada Khattab (Newcastle University). |
Year(s) Of Engagement Activity | 2018 |
URL | https://ivar.york.ac.uk/outreach |
Description | Radio broadcast (Word of Mouth) |
Form Of Engagement Activity | A broadcast e.g. TV/radio/film/podcast (other than news/press) |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Participation in Radio 4 'Word of Mouth' programme on 'Intonation: the Music of Speech' focussed on variation in the form and function of intonation across languages. |
Year(s) Of Engagement Activity | 2017 |
URL | http://www.bbc.co.uk/programmes/b08dnrqd |
Description | The role of language and language choices in participatory and collaborative work |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Third sector organisations |
Results and Impact | An invited contribution to the York Migration Network Ideas Salon #2 on the role of language and language choices in participatory and collaborative work on migration. The talk included a linguist's response to a visit to the York Art Gallery 'The Sea is the Limit' exhibition and outlined planned work to support home language maintenance for refugee families settled in Yorkshire (and beyond). |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.york.ac.uk/social-science/research/migration-network/events/2018/mignet-ideas-salon-2/ |