Systematic characterisation of genetically influenced 'omics' phenotypes and disease modules within biological networks

Lead Research Organisation: University of Cambridge
Department Name: Public Health and Primary Care

Abstract

Introduction: Identifying the biological pathways associated with diseases, and functionally characterising the changes that perturb biological processes, is key to understanding disease aetiology, prognosis and prevention. Recent studies have successfully identified associations of genetic variants and potentially causal genes with various proteins, metabolites and lipids, and their influence on key biological pathways associated with diseases [1-6]. In addition, there is increasing evidence of genetic overlap between seemingly unrelated diseases and traits, which points to a shared aetiology [7-8]. The aim of the proposed project is to understand the shared causes of diseases by combining multi-omics (protein, metabolite and lipid) phenotype data, genetic associations with multi-omics phenotypes and diseases, and electronic health records (EHR) within the INTERVAL Bioresource.

Background and Aim: The vast amount of data produced by these high-dimensional analyses is often difficult to interpret without constructing an interactive biological network. Gaussian Graphical Modelling (GGM) allows the construction of a biological (omics) network, and an automated feature detection algorithm will enable the extraction of disease modules from this network. Here, a disease module refers to a set of proteins, lipids or metabolites, together with their associated pathways, that the algorithm identifies as associated with diseases (e.g. coronary heart disease (CHD), type 2 diabetes (T2D)). Further functional characterisation and computational follow-up of these disease modules using clinical measures in EHR will lead to the identification of novel genes and pathways associated with diseases. This will also improve our understanding of the shared causes of diseases, for example where a change in a gene is associated with a reduced risk of CHD and T2D but an increased risk of asthma.

The project will primarily focus on:
1. Developing an interactive web resource for investigating phenotype data and genetic associations with the phenotypes measured within the INTERVAL Bioresource.
2. Developing a supervised learning method to identify disease modules within the biological network.
3. Investigating disease modules and their underlying biological pathways using electronic health records (EHR) to understand the shared aetiology of diseases.
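As a rough illustration of what extracting a module from an annotated network can look like, the sketch below groups disease-annotated nodes into connected sub-networks. This is a deliberately simple, unsupervised stand-in and not the supervised method the project proposes to develop; the function name `disease_modules`, the node labels and the toy edges are all hypothetical.

```python
from collections import deque


def disease_modules(edges, disease_nodes):
    """Return connected components of the sub-network induced by
    disease-annotated nodes (a simple stand-in for module detection)."""
    # Keep only edges whose two endpoints are both disease-annotated.
    adjacency = {node: set() for node in disease_nodes}
    for u, v in edges:
        if u in adjacency and v in adjacency:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, modules = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:  # breadth-first search over annotated neighbours
            node = queue.popleft()
            component.append(node)
            for neighbour in sorted(adjacency[node] - seen):
                seen.add(neighbour)
                queue.append(neighbour)
        modules.append(sorted(component))
    return modules


# Hypothetical toy network: P = protein, M = metabolite, L = lipid.
toy_edges = [("P1", "M1"), ("M1", "L1"), ("L1", "P2"), ("P3", "M2")]
toy_disease_nodes = {"P1", "M1", "P2", "P3", "M2"}  # L1 carries no annotation
toy_modules = disease_modules(toy_edges, toy_disease_nodes)
```

In this toy case P2 ends up in a module of its own, because its only link to the rest of the network runs through the unannotated node L1. A supervised method would instead learn which network features distinguish disease-associated regions.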

Methods: The proposed work will focus on constructing GGMs for the multi-omics data (proteins, metabolites and lipids), with edges representing the partial correlation between two phenotype measures conditioned on all other variables in the model. Metadata, including genetic associations with multi-omics phenotypes from Genome-Wide Association Studies (GWAS), genetic associations with diseases from public databases (PhenoScanner, SNiPA etc.) and biological pathways (KEGG), will be added to the network to allow the identification of molecular pathways and disease modules using supervised learning methods. This will be particularly useful as an automated tool to detect disease modules within a biological network and to inform Mendelian randomisation (MR) studies aimed at understanding the shared aetiology of diseases. These genetically influenced disease modules can be tested for association across Electronic Health Record (EHR) phenotypes (e.g. diagnosis, presence or absence of multi-morbidity, and drug responses). This agnostic approach will provide insights into the influence of perturbations within the omics network on the medical phenome and identify key omics phenotypes and pathways that are shared by diseases, thereby providing candidate targets for therapeutic intervention.
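To make the edge definition concrete, the following minimal sketch derives partial correlations from the inverse covariance (precision) matrix, using the standard identity that the partial correlation between variables i and j given all others equals -w_ij / sqrt(w_ii * w_jj), where w denotes precision matrix entries. The function names, the 0.1 edge threshold and the three-variable toy correlation matrix are assumptions made for illustration only. With many more phenotypes than samples, a regularised estimator (for example the graphical lasso) would normally be needed in place of direct inversion.

```python
from math import sqrt


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation of each pair, conditioned on all other variables."""
    prec = invert(cov)
    n = len(prec)
    return [[1.0 if i == j else -prec[i][j] / sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]


def ggm_edges(cov, names, threshold=0.1):
    """List (name_i, name_j, partial_correlation) for pairs above threshold."""
    pcor = partial_correlations(cov)
    return [(names[i], names[j], pcor[i][j])
            for i in range(len(names)) for j in range(i + 1, len(names))
            if abs(pcor[i][j]) >= threshold]


# Toy chain A - B - C: A and C correlate (0.25) only through B.
toy_cov = [[1.00, 0.50, 0.25],
           [0.50, 1.00, 0.50],
           [0.25, 0.50, 1.00]]
toy_edges = ggm_edges(toy_cov, ["A", "B", "C"])
```

In the toy example the marginal correlation between A and C is 0.25, but their partial correlation given B is zero, so the network keeps only the edges A-B and B-C (each with partial correlation of about 0.447). This is the property that lets a GGM separate direct from indirect relationships.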

References: 1. Shin, S.-Y. et al. Nat. Genet. (2014) 2. Long, T. et al. Nat. Genet. (2017) 3. Kettunen, J. et al. Nat. Commun. (2016) 4. Suhre, K. et al. Nat. Commun. (2017) 5. Suhre, K. et al. Nature (2011) 6. Draisma, H. H. M. et al. Nat. Commun. (2015) 7. Sarwar et al. Lancet (2012) 8. Ferreira et al. PLoS Genet. (2013)

Technical Summary

Recent studies have successfully identified associations of genetic variants and potentially causal genes with various proteins, metabolites and lipids, and their influence on key biological pathways associated with diseases [1-6]. In addition, there is increasing evidence of genetic overlap between seemingly unrelated diseases and traits, which points to a shared aetiology [7-8].
The proposed work will focus on constructing GGMs for the multi-omics data (proteins, metabolites and lipids), with edges representing the partial correlation between two phenotype measures conditioned on all other variables in the model. Metadata, including genetic associations with multi-omics phenotypes from Genome-Wide Association Studies (GWAS), genetic associations with diseases from public databases (PhenoScanner, SNiPA etc.) and biological pathways (KEGG), will be added to the network to allow the identification of molecular pathways and disease modules using supervised learning methods. This will be particularly useful as an automated tool to detect disease modules within a biological network and to inform Mendelian randomisation (MR) studies aimed at understanding the shared aetiology of diseases. These genetically influenced disease modules can be tested for association across Electronic Health Record (EHR) phenotypes (e.g. diagnosis, presence or absence of multi-morbidity, and drug responses). This agnostic approach will provide insights into the influence of perturbations within the omics network on the medical phenome and identify key omics phenotypes and pathways that are shared by diseases, thereby providing candidate targets for therapeutic intervention.
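Testing for association across many EHR phenotypes can be pictured as one test per phenotype with a multiple-testing correction. The sketch below is one simple way to do this, assuming a binary exposure (for instance, carrying a module-associated variant) and binary diagnoses, and using a two-sided Fisher exact test with a Bonferroni threshold. The function names, the phenotype labels and all counts are hypothetical; a real analysis would more likely use regression models with covariate adjustment.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b: exposed individuals with / without the diagnosis
    c, d: unexposed individuals with / without the diagnosis
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x exposed cases, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


def phenome_scan(tables, alpha=0.05):
    """Test each phenotype and flag those passing a Bonferroni threshold."""
    threshold = alpha / len(tables)
    results = []
    for phenotype, (a, b, c, d) in tables.items():
        p_value = fisher_exact_two_sided(a, b, c, d)
        results.append((phenotype, p_value, p_value < threshold))
    return sorted(results, key=lambda item: item[1])


# Hypothetical counts: (exposed cases, exposed controls,
#                       unexposed cases, unexposed controls).
toy_tables = {
    "phenotype_x": (0, 10, 10, 0),
    "phenotype_y": (3, 1, 1, 3),
}
toy_results = phenome_scan(toy_tables)
```

Here `phenotype_x` passes the corrected threshold and `phenotype_y` does not. Bonferroni correction is conservative when phenotypes are correlated, as diagnoses in health records usually are, so a false discovery rate procedure may be a reasonable alternative.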
References: 1. Shin, S.-Y. et al. Nat. Genet. (2014) 2. Long, T. et al. Nat. Genet. (2017) 3. Kettunen, J. et al. Nat. Commun. (2016) 4. Suhre, K. et al. Nat. Commun. (2017) 5. Suhre, K. et al. Nature (2011) 6. Draisma, H. H. M. et al. Nat. Commun. (2015) 7. Sarwar et al. Lancet (2012) 8. Ferreira et al. PLoS Genet. (2013)
 
Description EPIC-Norfolk 
Organisation Helmholtz Zentrum München
Country Germany 
Sector Public 
PI Contribution Conducted the largest genetic analysis of non-targeted metabolomics to date, in collaboration with the partners listed above. The outcomes of this research are now being prepared for publication in a high-impact journal.
Collaborator Contribution EPIC-Norfolk contributed approximately 12,000 samples to the genetic analysis. Prof. Karsten Suhre provided extensive advice on the research; Dr. Gabi Kastenmüller and Dr. Johannes Raffler provided computational biology support to develop a web server for the dissemination of results.
Impact Finalist for the 2018 Charles J. Epstein Trainee Award for Excellence in Human Genetics Research, for the presentation of the research work "Genetic Architecture of Human Plasma Metabolome". The study identified approximately 2,500 unique genetic variant-blood metabolite associations and discovered many novel pathways that are under genetic control.
Start Year 2018
 
Description EPIC-Norfolk 
Organisation University of Cambridge
Department Institute of Metabolic Science (IMS)
Country United Kingdom 
Sector Academic/University 
PI Contribution Conducted the largest genetic analysis of non-targeted metabolomics to date, in collaboration with the partners listed above. The outcomes of this research are now being prepared for publication in a high-impact journal.
Collaborator Contribution EPIC-Norfolk contributed approximately 12,000 samples to the genetic analysis. Prof. Karsten Suhre provided extensive advice on the research; Dr. Gabi Kastenmüller and Dr. Johannes Raffler provided computational biology support to develop a web server for the dissemination of results.
Impact Finalist for the 2018 Charles J. Epstein Trainee Award for Excellence in Human Genetics Research, for the presentation of the research work "Genetic Architecture of Human Plasma Metabolome". The study identified approximately 2,500 unique genetic variant-blood metabolite associations and discovered many novel pathways that are under genetic control.
Start Year 2018
 
Description EPIC-Norfolk 
Organisation Vanderbilt University
Country United States 
Sector Academic/University 
PI Contribution Conducted the largest genetic analysis of non-targeted metabolomics to date, in collaboration with the partners listed above. The outcomes of this research are now being prepared for publication in a high-impact journal.
Collaborator Contribution EPIC-Norfolk contributed approximately 12,000 samples to the genetic analysis. Prof. Karsten Suhre provided extensive advice on the research; Dr. Gabi Kastenmüller and Dr. Johannes Raffler provided computational biology support to develop a web server for the dissemination of results.
Impact Finalist for the 2018 Charles J. Epstein Trainee Award for Excellence in Human Genetics Research, for the presentation of the research work "Genetic Architecture of Human Plasma Metabolome". The study identified approximately 2,500 unique genetic variant-blood metabolite associations and discovered many novel pathways that are under genetic control.
Start Year 2018
 
Description EPIC-Norfolk 
Organisation Weill Cornell Medical College in Qatar
Country Qatar 
Sector Academic/University 
PI Contribution Conducted the largest genetic analysis of non-targeted metabolomics to date, in collaboration with the partners listed above. The outcomes of this research are now being prepared for publication in a high-impact journal.
Collaborator Contribution EPIC-Norfolk contributed approximately 12,000 samples to the genetic analysis. Prof. Karsten Suhre provided extensive advice on the research; Dr. Gabi Kastenmüller and Dr. Johannes Raffler provided computational biology support to develop a web server for the dissemination of results.
Impact Finalist for the 2018 Charles J. Epstein Trainee Award for Excellence in Human Genetics Research, for the presentation of the research work "Genetic Architecture of Human Plasma Metabolome". The study identified approximately 2,500 unique genetic variant-blood metabolite associations and discovered many novel pathways that are under genetic control.
Start Year 2018