Using integrative omics to disentangle causal relationships between tissue-specific pathways and coronary artery disease

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Molecular, Genetics & Pop Health

Abstract

This project aims to identify which biological pathways, acting in which tissues, contribute to complex genetic diseases and traits. We will develop a method that uses GWAS SNPs as genetic proxies and applies Mendelian randomisation (MR) to obtain a score for pathway functionality in relation to each trait.

The basic idea is to generate polygenic risk scores (PRSs) for the overall functionality of each pathway in each tissue of interest. Such a score indicates how well the pathway is functioning in that tissue and, in turn, how much it contributes to the trait of interest. We would thus obtain a score for how strongly each pathway is affected in each tissue, and therefore a score for how much the trait is affected by that pathway in that tissue. This has clear potential benefits for precision medicine: applied to an individual's genotype, the scores could highlight which pathways are most strongly affected in that person's disease and so allow a more targeted approach to therapy.

We assume that two factors affect pathway functionality in a given tissue: (1) the expression levels of the genes encoding the enzymes and cofactors involved in the pathway, and (2) how well each protein or enzyme in the pathway actually works, which can be linked to non-synonymous variants. The second factor appears to be rarely as important in common genetic traits, so we will focus on the regulation of gene expression.

We will therefore study single nucleotide polymorphisms (SNPs) that affect gene expression, known as expression quantitative trait loci (eQTLs), in different tissues. For each gene whose expression is significantly affected in the tissue of interest, we will generate a score using cis-eQTLs from several data sources (such as GTEx). This step may also involve comparing existing methods for building PRSs from GWAS data, such as LDpred, in order to combine many SNPs into a single score per gene.
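As a rough illustration of the per-gene score described above, the sketch below computes a weighted sum of allele dosages, using cis-eQTL effect sizes as weights. This is a simplifying assumption rather than the project's method: it ignores linkage disequilibrium between SNPs, which tools such as LDpred are designed to account for. The function name, variable names and toy data are hypothetical.

```python
import numpy as np


def gene_expression_score(dosages, eqtl_betas):
    """Genetically predicted expression of one gene in one tissue.

    dosages: (n_individuals, n_snps) effect-allele counts (0, 1 or 2).
    eqtl_betas: (n_snps,) cis-eQTL effect sizes for the gene (assumed given).
    """
    dosages = np.asarray(dosages, dtype=float)
    eqtl_betas = np.asarray(eqtl_betas, dtype=float)
    if dosages.shape[1] != eqtl_betas.shape[0]:
        raise ValueError("one effect size is needed per SNP")
    # Weighted sum over SNPs; LD between SNPs is NOT modelled here.
    return dosages @ eqtl_betas


# Toy data: 3 individuals, 2 cis-eQTL SNPs for a hypothetical gene.
toy_dosages = np.array([[0, 1], [2, 0], [1, 1]])
toy_betas = np.array([0.5, -0.2])
toy_scores = gene_expression_score(toy_dosages, toy_betas)
```

In practice the weights would come from a tissue-specific eQTL resource, and SNPs in strong LD would need pruning or reweighting before they are summed.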

Once a score has been generated for each gene in the tissue, the affected genes will be grouped by pathway (potentially using frameworks such as Reactome, and again potentially comparing resources) to create an overall "functionality score" for how important the pathway is in the tissue of interest. This step also requires weighting the importance of each gene and its protein product within the pathway. To do this, we will fit a multivariable regression of a protein outcome, defined as the "end point" of the pathway, on all the gene scores for that pathway. That is, variation in the level of this end-point protein will be used as an indicator of variation in the pathway.
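The weighting step could be sketched as an ordinary least-squares fit, in which the fitted coefficients act as per-gene weights and the fitted values act as the pathway score. The sketch assumes a linear model with an intercept and uses simulated data; all names below are hypothetical, and the project may use a different regression model.

```python
import numpy as np


def fit_pathway_weights(gene_scores, endpoint_protein):
    """Regress the end-point protein level on the per-gene scores.

    Returns (intercept, weights), with one weight per gene in the pathway.
    """
    gene_scores = np.asarray(gene_scores, dtype=float)
    endpoint_protein = np.asarray(endpoint_protein, dtype=float)
    # Prepend a column of ones so that an intercept is estimated.
    design = np.column_stack([np.ones(len(gene_scores)), gene_scores])
    coef, *_ = np.linalg.lstsq(design, endpoint_protein, rcond=None)
    return coef[0], coef[1:]


def pathway_score(gene_scores, intercept, weights):
    """Pathway functionality score: weighted sum of the gene scores."""
    return intercept + np.asarray(gene_scores, dtype=float) @ weights


# Simulated example: 200 individuals, 3 genes in a hypothetical pathway.
rng = np.random.default_rng(0)
sim_gene_scores = rng.normal(size=(200, 3))
sim_true_weights = np.array([1.0, 0.5, 0.0])
sim_protein = sim_gene_scores @ sim_true_weights + rng.normal(scale=0.1, size=200)

fit_intercept, fit_weights = fit_pathway_weights(sim_gene_scores, sim_protein)
sim_pathway_scores = pathway_score(sim_gene_scores, fit_intercept, fit_weights)
```

A gene whose score has little bearing on the end-point protein (the third gene in this simulation) receives a weight near zero, so it contributes little to the pathway score.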

The pathway scores will be trained in one dataset and then tested in two further datasets. Finally, the validated scores will be applied to specific diseases, such as coronary artery disease (CAD), to estimate the effect of each pathway on the outcome.
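One way to picture the train-then-test design is to fit the weights in a training cohort and report the out-of-sample R² in each of two held-out cohorts. The sketch below uses simulated cohorts and R² as the validation metric; both are assumptions made for illustration, and every name is hypothetical.

```python
import numpy as np


def fit_weights(x, y):
    """Least-squares coefficients (intercept first) from the training data."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def out_of_sample_r2(coef, x, y):
    """R^2 of the trained score against the outcome in a held-out dataset."""
    pred = np.column_stack([np.ones(len(x)), x]) @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def simulate_cohort(rng, n, weights, noise_sd=0.5):
    """Simulated gene scores and end-point protein levels for n individuals."""
    x = rng.normal(size=(n, len(weights)))
    y = x @ weights + rng.normal(scale=noise_sd, size=n)
    return x, y


rng = np.random.default_rng(1)
true_w = np.array([0.8, 0.3, -0.4])

# One training cohort and two independent test cohorts.
train_x, train_y = simulate_cohort(rng, 500, true_w)
test_sets = [simulate_cohort(rng, 300, true_w) for _ in range(2)]

coef = fit_weights(train_x, train_y)
test_r2 = [out_of_sample_r2(coef, x, y) for x, y in test_sets]
```

Because the weights are frozen after training, a marked drop in R² between cohorts would suggest that the score does not transfer well.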

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
MR/N013166/1                                   01/10/2016  30/09/2025
2106193            Studentship   MR/N013166/1  01/09/2018  31/05/2022  Sebastian May-Wilson