Using signatures of mutation and selection in genome evolution to interpret contemporary genetic variation

Lead Research Organisation: MRC Human Genetics Unit


Improving our understanding of genetic differences between species allows us to better interpret genetic risk in people.
We are all at risk of developing a wide range of diseases, some very common, including heart disease, diabetes, dementia and cancer. But such risks differ hugely between individuals, and are to a large degree influenced by the sequence of DNA in our cells.
The big question is which of the many thousands of DNA differences between individuals are responsible for increasing or decreasing their risk of developing a given disease. The historical record of evolution can provide some answers. We can read it as the differences in DNA between species, for example human versus mouse. The pattern of differences between species can reveal functionally important regions of DNA. Contrasting the between-species pattern with the differences between people can point to the critically important changes that influence disease risk.
More broadly, we compare how DNA has changed between species with the differences between people. This allows us to study why and where DNA changes (mutations) arise, and what the functional consequences of those changes are. We are applying these methods to understand the genetic basis of many rare and common diseases.

Technical Summary

We are endeavouring to understand the processes of selection and mutation that are acting to shape genomes. It is well known that selection shapes the pattern of genomic changes that accumulate as populations diverge from a common ancestor. This has proven to be a very useful signature for the identification of functionally important regions of the genome. But it is becoming increasingly clear that variation in the pattern of new mutations can also generate superficially similar signals. Improved separation of these confounding evolutionary signatures is crucial if we are to understand how organisms evolve, and to relate contemporary genetic changes to human biology and disease.

This work is multi-disciplinary, operating at the intersection of population genetics, evolutionary genomics and functional genomics, and integrating and processing large (multi-genome scale) datasets. Genetic changes that have accumulated between species tell us about the combined effects of both mutation pattern and selection. In contrast, mutations that are still segregating in populations (as rare variants or polymorphisms) show the same impact of mutation but the consequences of selection are more subtle. We contrast between-species and within-species variation to separate the patterns of mutation from those imposed by selection.

The results of this research fall into three broad categories:
An improved understanding of mutational mechanisms: why and where particular types of mutation occur. This is of particular importance in understanding the progression of cancers and explaining why some genes appear to carry a higher detrimental mutational load in the population than others.
Better detection of both purifying and diversifying selection, both to interpret function and to relate functional genomic measures (e.g. chromatin, origins of replication, regulatory sequences) to evolution.
The development and application of methods to identify genes, pathways and biological systems enriched for deleterious mutations associated with human genetic diseases and cancer. These approaches build on established techniques from the field of evolutionary biology and can be applied to all genetic diseases.
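The contrast between changes fixed between species and variants still segregating within a species can be illustrated with a minimal sketch. It uses the McDonald-Kreitman neutrality index, a standard technique from evolutionary biology that is chosen here purely for illustration; this record does not state which methods the project uses. The class name, function names and example counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Counts of changes observed in one gene (illustrative values only)."""
    fixed_nonsyn: int  # Dn: nonsynonymous differences fixed between species
    fixed_syn: int     # Ds: synonymous differences fixed between species
    poly_nonsyn: int   # Pn: nonsynonymous variants segregating within a species
    poly_syn: int      # Ps: synonymous variants segregating within a species


def neutrality_index(c: SiteCounts) -> float:
    """Return NI = (Pn / Ps) / (Dn / Ds), computed as (Pn * Ds) / (Ps * Dn)."""
    denominator = c.poly_syn * c.fixed_nonsyn
    if denominator == 0:
        raise ValueError("NI is undefined when Ps or Dn is zero")
    return (c.poly_nonsyn * c.fixed_syn) / denominator


def alpha(c: SiteCounts) -> float:
    """Return 1 - NI, a rough estimate of the adaptive fraction of substitutions."""
    return 1.0 - neutrality_index(c)


# Hypothetical counts, not taken from any dataset in this project.
example = SiteCounts(fixed_nonsyn=10, fixed_syn=20, poly_nonsyn=5, poly_syn=40)
print(neutrality_index(example))
print(alpha(example))
```

Under the assumptions of this test, a neutrality index below 1 indicates an excess of fixed nonsynonymous changes, which is consistent with positive selection, while a value above 1 indicates an excess of nonsynonymous polymorphism, which is consistent with weakly deleterious variants that purifying selection has not yet removed. The sketch assumes that synonymous changes are neutral and that both classes of site share the same mutation process, which is exactly the kind of assumption that variation in mutation pattern can violate.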
Title DeepJigsaw 
Description Software system to process the results of randomly concatenated and high-throughput sequenced PCR products, their genome alignment and calling of polymorphisms. 
Type Of Material Improvements to research infrastructure 
Year Produced 2010 
Provided To Others? Yes  
Impact Manuscript published (PMID: 21131973), further ongoing research. 
Title Genome alignment manipulation 
Description Software methods to manipulate whole-genome scale multiple sequence alignments. 
Type Of Material Improvements to research infrastructure 
Year Produced 2010 
Provided To Others? Yes  
Impact Application to ongoing research by multiple groups. 
Title MHC Transcripto-spliceome database 
Description A database and software system for the design of a complex tiling and splice junction microarray system and the analysis of derived results. 
Type Of Material Improvements to research infrastructure 
Year Produced 2011 
Provided To Others? Yes  
Impact One manuscript published (PMID: 21628452); the microarray chip and the database are currently in use by other groups. 
Description Accurate measurement of nucleic acids 
Organisation Laboratory of the Government Chemist (LGC) Ltd
Country United Kingdom 
Sector Private 
PI Contribution Development of novel methods to explore the quantification of DNA methylation.
Collaborator Contribution Preparation and generation of known reference data.
Impact PMID:22841564 PMID:25539843
Start Year 2012
Description FANTOM5 Consortium 
Organisation RIKEN
Department Omics Science Center
Country Japan 
Sector Public 
PI Contribution We are leading the evolutionary analysis of these data, comparing the patterns of transcriptional regulation between species to understand how the regulatory networks have evolved. We are also contributing to the primary filtering, quality control and interpretation of the data.
Collaborator Contribution Provision of exceptional primary data on which research is based.
Impact The FANTOM5 Consortium is a multi-national, multi-disciplinary project to investigate transcriptional regulation across mammalian genomes. The project traverses the fields of genomics, immunology, neuroscience, computational and mathematical biology.
Start Year 2010
Description FANTOM6 Consortium 
Organisation RIKEN
Department Institute of Physical and Chemical Research (RIKEN)
Country Japan 
Sector Public 
PI Contribution Planning of a large-scale systematic study of lncRNAs and their effect on gene regulation. Planning and initiating analysis of the resulting data.
Collaborator Contribution Planning, coordination and primary data generation.
Impact Project is ongoing - no impact yet.
Start Year 2015
Description mir-941 
Organisation Chinese Academy of Sciences
Department CAS-MPG Partner Institute for Computational Biology (PICB)
Country China 
Sector Academic/University 
PI Contribution Data analysis, in particular to show how the miR-941 locus evolved through the primate lineage from an evolutionarily volatile, tandemly repetitive sequence. Writing and editing of the manuscript, preparation of figures.
Collaborator Contribution Generation of molecular biological data, genetic data analysis.
Impact Manuscript published (PMID: 23093182). Agreement to seek joint funding to support further studies. Public engagement with science: this group took the lead in communicating the results and insights of this work to the general public through a press release, radio interviews and articles in the popular media (e.g. The Times, The Independent and news agencies across the world). See public engagement section.
Start Year 2012
Description A-IMBN of mice and men 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Type Of Presentation Paper Presentation
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Assistance in drafting a broad-audience highlight of our recently published work. Wide audience in the Asia-Pacific biomedical research community.

Positive feedback from Australian and Japanese researchers.
Year(s) Of Engagement Activity 2012
Description Edinburgh International Science Festival 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Type Of Presentation Workshop Facilitator
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact One-to-one interaction with over 100 children and their families. Demonstration, answering questions and provoking thought. Two doctoral students participated in this activity.

Enthusiastic participation from children and parents. Answering more in-depth questions on genetics from the general public.
Year(s) Of Engagement Activity 2011,2012