Development of statistical methods to evaluate the functional impact of non-coding genetic variants

Lead Research Organisation: Newcastle University
Department Name: Institute of Cellular Medicine

Abstract

Keywords: Analytical science, biological informatics.
Summary:
Next-generation sequencing of exomes has revolutionized the genetic diagnosis of a broad spectrum of diseases, making it possible to detect mutations within most genes of the human genome. However, this strategy does not address the possibility of pathogenic mutations that lie in non-coding regions of the genome. For example, some patients with primary immunodeficiency present clinical phenotypes that are compatible with mutations of known disease-associated genes, yet carry no mutation within the relevant coding regions. Where accompanied by alterations at the transcriptional level, these similar phenotypes suggest that the same disease genes may be dysregulated by mutations in non-coding regulatory regions that influence their expression.
Whole genome sequencing (WGS) allows the exploration and potential discovery of such non-coding genomic variants. However, non-coding regions are highly polymorphic, so identifying potential disease-associated mutations with confidence remains a challenge. Furthermore, annotation of the non-coding space has until recently been held back by the complexity and cell-type specificity of enhancer activity and chromatin accessibility. In this project, we will take advantage of the recent explosion of expression quantitative trait loci (eQTL) data and epigenomic information to refine our ability to identify potentially deleterious variants within cell type-specific regulatory regions.
There has been a recent expansion in the development of computational tools and methodologies that use statistical models to interrogate non-coding variants, each with its own advantages and disadvantages. Machine learning, for example, is already being tested as a means of predicting non-coding passenger mutations in cancer genomes (Yang et al., 2016). The main challenges are to integrate all the current sources of information in a common framework and to develop new, creative strategies for the prediction of damaging variants.
In this project, the student will develop new methodologies to predict non-coding variants that drive the dysregulation of clinically relevant disease genes, with the aim of improving the diagnosis and future therapy of the associated diseases. The main focus will be on exploiting cell type-specific data that are relevant for each disease. The analytical protocols will be tested on WGS data from patients with primary immunodeficiencies and further validated with WGS data from other diseases in which different cell types are affected, such as neurological diseases. Where available, transcriptional data will be integrated into the analysis to test the validity of variant predictions. In the era of genomic medicine, such protocols have potentially broad significance.

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509528/1                                   01/10/2016  31/03/2022
1961022            Studentship   EP/N509528/1  01/10/2017  30/09/2020  Maninder Heer