Developing and Exploiting an ESAP Corpus: Design, Compilation, Exploration and the Creation of an Online Lexical Tool

Lead Research Organisation: University of Warwick
Department Name: Centre for Applied Linguistics

Abstract

English for Specific Academic Purposes (ESAP) corpora engage with current conceptions of academic literacies by acknowledging 'the literacy demands of the curriculum as involving a variety of communicative practices, including genres, fields and disciplines' (Lea and Street, 1998: 159). However, because they are typically compiled by individuals for research purposes (Krishnamurthy and Kosem, 2007: 35; Alsop and Nesi, 2009: 72), they largely remain private collections of texts that are not in the public domain. The aim of this project is to develop and exploit a large ESAP corpus of Arts and Humanities (AH) texts, and to make the corpus and its findings freely available online through a dedicated website for use in EAP pedagogy and research.

The first stage of this project will be to design and compile the corpus. Its structure will be based on empirical linguistic evidence (e.g. Durrant, 2009), using PhD theses as the source texts because they represent assessed student writing at the highest level of academic literacy (Thompson, 2005) and are strongly shaped by discipline-specific conventions (Swales, 2004: 103). The corpus will need to be large in order 'to analyse rarer items, and to detect the finer details of language use' (Krishnamurthy, 2000: 175), e.g. academic words, bundles and collocations. The final AH corpus will be made freely available online to EAP students, teachers and researchers.

Following this, the researcher will use the corpus to explore important linguistic and discoursal features of AH texts, particularly in terms of specialised vocabulary. This is important because the majority of existing listings of academic vocabulary are generic and therefore provide an inadequate foundation for understanding disciplinary conventions (see Hyland and Tse, 2007; Hyland, 2008; Durrant, 2009). This mixed-methods corpus exploration, combining quantitative data (corpus findings) with qualitative data (expert judgement by EAP practitioners), will address the disciplinary variation found in written academic discourse, yielding new findings of theoretical and practical significance for EAP researchers and teachers.

Finally, because there is evidence to suggest that actual classroom use of corpora is not widespread (see Jarvis, 2004; Gilbert, 2013; Timmis, 2015), it is important that findings from the corpus creation and exploration are fed into the development of a free and user-friendly online lexical tool for use in ESAP teaching and learning. This tool will be piloted in an experimental ESAP course in order to gauge its efficacy for teaching and learning purposes. Unlike existing online lexical tools (e.g. Compleat Lexical Tutor), this discipline-specific tool will engage with current conceptions of academic literacies. It will be the first of its kind and will have practical implications for ESAP teaching and learning in the UK and beyond.

The project will address the following research questions:
1. What issues and processes are involved in compiling a large AH corpus that will be made available online?
2. What are some of the most important, frequent and significant features of the vocabulary of AH discourse, and how are they similar to or different from comparable features of general academic discourse?
3. Can an AH online lexical tool help students improve their vocabulary knowledge in a user-friendly manner?

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
ES/P000711/1                                   01/10/2017  30/09/2027
2108578            Studentship   ES/P000711/1  01/10/2019  25/06/2024  James O'Flynn