Comparable and Parallel Corpus Approaches to the Third Code: English and Chinese Perspectives

Lead Research Organisation: Lancaster University
Department Name: Linguistics and English Language


With increasing globalisation, translation plays an ever more important role in helping West meet East and vice versa. Translation Studies is an area of research that has been greatly aided and advanced by corpus methodology over the past two decades. As translating is a kind of mediated communication involving different languages, the effect of the source language on the translation is inevitable, and strong enough to make translational language perceptibly different from the native target language. The proposed research, which involves international collaboration between Lancaster University in the UK and the Hong Kong Polytechnic University, is a corpus-based study of the common linguistic and textual features of translations from the perspectives of English and Chinese, two major world languages that are genetically distinct. It adopts an innovative composite methodology that combines the comparable corpus approach and the parallel corpus approach to language studies, on the basis of sizeable corpora of authentic language use in English and Chinese. The combination of i) the balanced coverage of usage contexts in language use, ii) the strict comparability of the corpora used, iii) the typological distance between English and Chinese as two distinctly different languages, iv) the innovative composite corpus-based approach integrating comparable corpus and parallel corpus analysis, and v) the inclusion of register variation as a variable will enable the researchers to investigate the linguistic and textual features of translational language more systematically and to achieve more reliable research outcomes than has previously been possible. Research of this kind can cast new light on the translation process and help to uncover translation norms, which are of theoretical significance for Translation Studies as well as of practical importance for translation teaching, translator training and translation practice.

Planned Impact

In addition to the academic users mentioned earlier, non-academic beneficiaries of the proposed research include a) translation students and trainee translators as well as their teachers; b) translation practitioners; and c) professional translation bodies in multilingual societies, such as the Hong Kong Translation Society. The research outcomes and the corpus resources to be produced by the proposed project will be of significant practical and pedagogical value to translation teaching and training, thus benefiting translation students and trainees as well as their teachers. Knowledge transfer based on the research data and outputs, including corpus-based data-driven learning (DDL) in translation teaching and translator training and tool development for translation practitioners, will be explored as part of the project. In addition, translation practitioners and members of professional bodies will benefit from public lectures, seminars and workshops based on the research outputs, data and tools produced in our research. (Please refer to the Pathways to Impact attachment for our plan to engage these non-academic users.)



Xiao, R (2013) Translation universal hypotheses reevaluated from the Chinese perspective in (Video dissemination of conference plenary lecture)

Xiao, R (2013) Corpus-based Translation Studies: A new framework for Translation Studies and translation teaching in (Video dissemination of conference panel session)

Description The most significant achievements of this ongoing project include: 1) Three corpora have been created by the collaborating teams in the UK and HK as important empirical bases and new research resources for both academic and non-academic audiences. These corpora are a one-million-word balanced comparable corpus of translated English (Corpus of Translational English) and two one-million-word balanced parallel corpora of English-to-Chinese and Chinese-to-English translation. 2) Based on the comparable and parallel corpora, the general linguistic and textual features of translational English and/or translational Chinese, and the variation in such features across registers and genres, have been thoroughly investigated, yielding new knowledge. The typical monolingual features of translational language, as well as the cross-lingual commonalities and differences between translated English and translated Chinese, are important empirical and theoretical contributions to linguistic knowledge in general and to translation studies specifically. 3) An innovative composite methodology has been applied and refined throughout the research. The methodology not only combines comparable and parallel corpus approaches, but also incorporates statistical analyses of multiple linguistic features of texts in two genetically distant languages across various registers and genres. 4) In addition to academia, knowledge transfer to non-academic beneficiaries was carried out in the second phase of the project. During post-project activities, we aim to disseminate the corpora and research results to translation teachers, trainees, practitioners and professional translation bodies.
Exploitation Route Work that began in the second phase of the project and remains ongoing aims to disseminate the corpora and research findings to both academic and non-academic audiences. The outputs of our research will be of interest and benefit to other researchers in corpus linguistics, contrastive and translation studies, and computational linguistics. We have disseminated some of the findings to academia by giving presentations at conferences and workshops and by submitting two research articles to international peer-refereed journals. A monograph is also in preparation. For non-academic audiences, we are still working, in collaboration with the HK co-investigators, on delivering the corpora and findings to translation teachers, students, practitioners and professional translation bodies. Our corpora will be useful databases for translation training and practice, and for the development of software tools for machine translation and translation memory. Some research findings are of pedagogical significance for translation training. We hope these findings can be incorporated into new textbooks and translation training programmes, and can raise translators' awareness of the distinctive features of translational language.
Sectors Digital/Communication/Information Technologies (including Software),Education,Culture, Heritage, Museums and Collections

Description With the second phase of the project now complete, we continue to work on disseminating our findings to academic and non-academic audiences through conference presentations, articles in international peer-refereed journals, a monograph, a project website, seminars for translation practitioners and teachers, etc. The actual use of our research findings will be reported in more detail after these various vehicles for the realisation of impact have had time to reach their intended audiences.
First Year Of Impact 2014
Sector Digital/Communication/Information Technologies (including Software),Education,Culture, Heritage, Museums and Collections
Impact Types Cultural,Societal

Title Corpus of Translational English 
Description A 1 million word balanced corpus of English texts translated from other languages. 
Type Of Material Database/Collection of data 
Year Produced 2014 
Provided To Others? Yes  
Impact The COTE corpus is in use by our research partners. 
Description ESRC-RGC joint project 
Organisation Hong Kong Polytechnic University
Department Department of Chinese and Bilingual Studies
Country Hong Kong 
Sector Academic/University 
PI Contribution Taking the comparable corpus approach to the third code.
Collaborator Contribution Taking the parallel corpus approach to the third code.
Impact Research based on comparable and parallel corpora is currently ongoing on both sides of the collaboration. The second phase of the project will involve knowledge transfer to achieve wider impact.
Start Year 2013