Inference and analysis of gene genealogies from large genomic data sets

Lead Research Organisation: University of Oxford

Abstract

The genealogical relationships between individuals give rise to a complex correlation structure in their genetic sequences. At each genome position, we can trace the ancestry of all individuals by considering a coalescent process in which the lineages of individuals merge until the most recent common ancestor of all individuals is reached. These genealogical relationships change along the genome as a result of historical recombination events, giving rise to intricate structures that can be represented as graphs. These graphs encode the evolutionary history of the analyzed samples. When inferred from genomic data, they can be used to study that history and, when phenotype and environmental data are available, to inform the analysis of heritable traits. We will develop statistical and algorithmic methods to reconstruct and analyze these representations of genealogical relatedness. These methods will enable new analyses of human evolutionary history (e.g. inferring historical population size variation and relationships across groups) and of complex traits (e.g. estimating heritability, performing polygenic prediction, and detecting associations between genetic variation and diseases). We will develop and distribute scientific software that implements these novel methodologies.
Research areas: Mathematical Biology; Biological informatics.
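
As an illustration of the coalescent process described in the abstract, the following is a minimal sketch of a standard Kingman coalescent at a single genome position, assuming a constant population size and no recombination. It is not the project's method; the function name simulate_coalescent and the event format are hypothetical and chosen only for this example. With k lineages remaining, the waiting time to the next merger is drawn from an exponential distribution with rate k(k-1)/2, and two lineages chosen uniformly at random are merged.

```python
import random


def simulate_coalescent(n_samples, seed=None):
    """Simulate one genealogy under the standard Kingman coalescent.

    Assumptions (illustrative only): constant population size, a single
    genome position, no recombination. Times are in coalescent units.

    Returns a list of (time, child_a, child_b, parent) merge events.
    Node ids 0 .. n_samples - 1 are the sampled individuals; each merge
    creates a new ancestral node with the next unused id.
    """
    rng = random.Random(seed)
    lineages = list(range(n_samples))  # lineages still to be merged
    next_node = n_samples              # id for the next ancestral node
    time = 0.0
    events = []

    while len(lineages) > 1:
        k = len(lineages)
        rate = k * (k - 1) / 2          # number of lineage pairs
        time += rng.expovariate(rate)   # waiting time to the next merger
        a, b = rng.sample(lineages, 2)  # pick two distinct lineages
        lineages.remove(a)
        lineages.remove(b)
        lineages.append(next_node)
        events.append((time, a, b, next_node))
        next_node += 1

    # The last event creates the most recent common ancestor of all samples.
    return events


if __name__ == "__main__":
    for t, a, b, parent in simulate_coalescent(5, seed=1):
        print(f"t={t:.3f}: lineages {a} and {b} merge into node {parent}")
```

Each run produces n_samples - 1 merge events, and the parent of the final event is the most recent common ancestor. Modelling recombination would require a separate genealogy for each stretch of the genome, which this sketch does not attempt.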

Planned Impact

In the same way that bioinformatics has transformed genomic research and clinical practice, health data science will have a dramatic and lasting impact upon the broader fields of medical research, population health, and healthcare delivery. The beneficiaries of the proposed training programme, and of the research that it delivers and enables, will include academia, industry, healthcare, and the broader UK economy.

Academia: Graduates of the training programme will be well placed to start their post-doctoral careers in leading academic institutions, engaging in high-impact multi-disciplinary research, helping to build training and research capacity, and sharing their experience within the wider academic community.

Industry: Partner organisations will benefit from close collaboration with leading researchers, from the joint exploration of research priorities, and from the commercialisation of arising intellectual property. Other organisations will benefit from the availability of highly qualified graduates with skills in big health data analytics.

Healthcare: Healthcare organisations and patients will benefit from the results of enabled and accelerated health research, leading to new treatments and technologies, and an improved ability to identify and evaluate potential improvements in practice through the analysis of real-world health data.

Economy: The life sciences sector is a key component of the UK economy. The programme will provide partner companies with direct access to leading-edge research. Graduates of the programme will be well qualified to contribute to economic growth, supporting health research and the development of new products and services, and will be able to inform policy and decision making at organisational, regional, and national levels.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S02428X/1                                   01/04/2019  30/09/2027
2633452            Studentship   EP/S02428X/1  01/10/2021  30/09/2025  Jiazheng Zhu