Integrated gene and brain mapping of language and reading abilities

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Philosophy Psychology & Language


The ability to communicate using language is unique to humans. Language is central to our social world, and the associated reading and writing skills are central to education and daily life. People vary in how well they comprehend and use language, and in their abilities to read and write. Our research aims to better understand the biology behind these individual differences in language and reading abilities in the general population. Most genetic research in this area has focussed on children who have difficulties acquiring language and reading skills, but the sample sizes in these studies are too small to find the common genes of very small individual effect that are likely to influence normal variation in these traits. Our research is novel because it collects language and reading measures in adults, and can therefore rapidly attain sample sizes large enough to make genetic discoveries. The sample that we focus on is special for three main reasons. 1) Its members already have genome-wide genotyping data available, so we can run our gene discovery analysis (genome-wide association; GWA) as soon as language and reading skills have been measured in this sample. 2) A subsample have recently had their brains scanned using magnetic resonance imaging (MRI), so we can investigate how their language and reading skills relate to structural features and networks of the brain. Importantly, some of this sample have historical measures of reading skill in childhood (ages 7-11 years), so we can test whether their childhood reading scores are related to the same brain structures and networks measured in adulthood. 3) The sample have genome-wide methylation data available. Methylation is a chemical mark that can attach to DNA, affecting gene expression and modifying gene function. This type of epigenetic marker (a non-genetic influence on gene expression) has not previously been tested for its association with language and reading skills; our study will provide the first such investigation.
Our GWA results will be combined with those from international GWA cohorts to produce the largest GWA of language and reading abilities. We can further boost the power of gene discovery by using brain MRI markers in a multivariate GWA meta-analysis with language/reading measures. Finally, we will perform an integrated analysis of gene expression, methylation, and genetic data to find further novel genes associated with language and reading abilities, and to better understand the role of variants that lie outside genes. Our novel approach of combining genetic, epigenetic, and brain imaging research streams will provide a comprehensive framework for understanding the biology of language and reading abilities, enabling the important link from genes to brain to behaviour.

Technical Summary

Genome-wide association (GWA) has revolutionised the discovery of genes and biological pathways influencing complex traits, but not in the area of language and reading abilities, where reliance on small samples of children and adolescents limits the statistical power required for successful GWA. The proposed research aims to accelerate the discovery of genes and biological pathways influencing language and reading traits by three main methods. Firstly, we will use a novel sampling strategy focussed on adults to achieve a large GWA sample (N ~ 10,000), whose results will then be meta-analysed with GWA results from international cohorts (total N ~ 36,400). Secondly, we will identify brain imaging structural and connectivity endophenotypes of language and reading abilities in a very large single magnetic resonance imaging (MRI) study (N ~ 1,200; a subsample of the GWA sample). These results will inform a multivariate GWA meta-analysis of language/reading and MRI traits to further increase statistical power for gene discovery. Thirdly, we will investigate associations between genome-wide methylation and language and reading abilities to establish epigenetic associations. This will be the first epigenome-wide association study (EWAS) of these traits, in a sample (N ~ 10,000) larger than those of current EWAS of other cognitive abilities. Importantly, the data generated by these three main research streams will be integrated with publicly available methylation and expression QTL data to find further novel genetic variants influencing language and reading abilities, and to better characterise the role of newly discovered GWA variants, especially those in noncoding regions. This multi-layered approach of genetics, neuroimaging, and epigenetics will transform our understanding of the biological pathways connecting genes and brain to language and reading behaviour.
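
To illustrate how GWA results from separate cohorts can be pooled, the sketch below shows a fixed-effect, inverse-variance-weighted meta-analysis for a single genetic variant. This is a common general approach and is offered only as an assumption; the summary does not state which meta-analysis method the project will use. The function name `ivw_meta` and the effect sizes and standard errors in the usage example are hypothetical and are not taken from any cohort.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    betas: per-cohort effect estimates for the same allele and trait scale.
    ses:   per-cohort standard errors of those estimates.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")

    # Each cohort is weighted by the inverse of its sampling variance,
    # so larger (more precise) cohorts contribute more to the pooled estimate.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value from the standard normal distribution.
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Hypothetical per-cohort estimates for one variant (illustrative only).
    beta, se, z, p = ivw_meta([0.02, 0.04, 0.03], [0.010, 0.015, 0.020])
    print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
```

Because the pooled standard error shrinks as cohorts are added, combining a sample of roughly 10,000 with international cohorts can detect smaller effects than any single cohort alone, which is the motivation for the meta-analysis described above.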

Planned Impact

This project targets the biological origins of language and reading abilities, essential cognitive skills known to influence educational and occupational success, and linked to health. The first group of potential beneficiaries is therefore people with low reading ability. Increased understanding of the biological mechanisms underpinning reading and language abilities has the potential to improve the quality of life of those who struggle to develop these skills. Such people tend to be overrepresented in the prison system and have more limited occupational opportunities.
These individual and social outcomes also affect national wealth. In European countries, the economic cost of illiteracy is roughly 2% of gross domestic product; in the UK, for example, roughly £81 billion is lost to the economy each year (World Literacy Foundation, 2015 economic report).
One specific example of the translational impact of this research is the potential to improve the prediction of those people who are genetically predisposed to language and reading difficulties. Reading interventions are most successful if they are implemented early, meaning that such genetic tests could even be performed prior to reading instruction, maximising the success of remediation. The development of polygenic prediction models based on our meta-analysis genome-wide association (GWA) findings could commence on completion of our project and is a downstream project with relevance to education that we are very interested in pursuing. Our summary GWA results will also be made publicly available so that other research groups with such interests can construct polygenic scores in their own data. Polygenic prediction models could be improved by incorporating environmental predictors such as the quality of early social interaction and phonics instruction (for reading). This would enable optimal environments (including intervention) for language/reading acquisition to be provided earlier in a child's development. There might also be applications for improving the acquisition of these skills in the general population. Once we identify neuroimaging correlates of normal language and reading abilities, this will provide a knowledge base for downstream research to investigate whether particular learning strategies more efficiently build up and consolidate the brain networks involved in reading and writing. If such learning strategies were implemented within education, this would produce a more literate society, with associated benefits to employment and health and a lowered risk of delinquency, thus improving both individuals' quality of life and the economic and social fabric of society as a whole.
The above outlines some of the longer-term potential impacts, but there is also scope for people to benefit from the research during the life-cycle of the project itself. For example, health professionals who identify reading/language problems and educators who teach children how to read may benefit from the project's findings. A better understanding of genetics and reading more generally may enable these groups to discuss reading ability with parents more confidently. From previous engagement activities, the PI knows that this is an area that is often misunderstood, with people sometimes being wary of what they perceive as a deterministic approach to literacy. This project offers the opportunity to open up a dialogue with educators and the public about how genetic data could be used to improve educational and social outcomes. More detail of how we will engage with these groups is given in the Pathways to Impact section.

