Determining the causal links and clinical significance of rare genetic variants

Lead Research Organisation: University of Edinburgh
Department Name: MRC Human Genetics Unit

Abstract

Genome-wide association studies (GWAS) have identified many common genetic variants associated with various traits and diseases. However, most individual effects of common variants at the trait level are very small, so very large sample sizes (i.e. tens of thousands) are required to detect them, and the associations found provide low-accuracy predictions of an individual's liability to disease or outcomes after treatment. Genetic effects of common variants on intermediate phenotypes, such as gene expression or protein concentrations, are often much larger than those on traits and diseases, and such associations are thus often detectable in smaller samples of hundreds or thousands of individuals. Combining information from large studies of disease outcomes and smaller transcriptomic or proteomic studies in a two-sample Mendelian randomisation study can provide evidence for a causal path from DNA variation through gene expression to a disease outcome. Even so, the relatively small genetic effects of common variants on phenotypic traits make them difficult to study in the small functional studies feasible in the laboratory.
Some genetic variants have larger genetic effects than the common variants detected by GWAS, but such variants are often kept at low frequency within individual pedigrees by natural selection as a consequence of their larger effects on individual fitness. Such variants are hence difficult to detect in cosmopolitan studies of unrelated individuals, but may become detectable in studies of pedigreed populations, especially where small founder population size and drift may increase the frequency of otherwise rare variants. Nonetheless, such variants are unlikely to be in linkage disequilibrium (LD) with SNPs on standard arrays, and hence are unlikely to be captured by associations with those SNPs. Rare variants of large effect are most likely to be located within or close to expressed genes. Hence, using DNA sequence from the exome and adjacent regions is a good strategy to capture such variants.
In this project we propose to link proteomic data with the exome variants to detect locally (i.e. cis) acting genetic effects on protein concentrations.
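As an illustration only, a cis-acting effect of a single variant on a protein concentration can be sketched as a simple linear regression of protein level on genotype dosage. The function name `cis_association` and the toy data are hypothetical and are not taken from the project; the sketch ignores covariates and relatedness, both of which a real analysis in isolate cohorts would need to model (for example with a mixed model).

```python
import math


def cis_association(dosages, protein_levels):
    """Regress protein level on genotype dosage (0, 1 or 2 copies of the allele).

    Returns (beta, se, z): the per-allele effect, its standard error and
    the z statistic. Assumes unrelated individuals and no covariates.
    """
    n = len(dosages)
    if n != len(protein_levels) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_g = sum(dosages) / n
    mean_p = sum(protein_levels) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    sxy = sum((g - mean_g) * (p - mean_p)
              for g, p in zip(dosages, protein_levels))

    beta = sxy / sxx
    intercept = mean_p - beta * mean_g

    # Residual variance with n - 2 degrees of freedom.
    rss = sum((p - intercept - beta * g) ** 2
              for g, p in zip(dosages, protein_levels))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = beta / se if se > 0 else math.inf
    return beta, se, z


# Toy data: six individuals, protein level rising by about one unit per allele.
beta, se, z = cis_association([0, 0, 1, 1, 2, 2],
                              [1.0, 1.2, 2.0, 2.2, 3.0, 3.2])
print(f"beta={beta:.3f} se={se:.3f} z={z:.1f}")
```

For rare variants, most carriers are heterozygous, so the dosage vector is mostly zeros and the standard error depends heavily on the number of carriers in the sample.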

Technical Summary

Large studies of disease outcomes combined with studies of intermediate phenotypes (transcriptome, proteome) in two-sample Mendelian randomisation (MR) can be used to provide evidence for a causal path from DNA variation through gene expression to disease outcome. Even so, their small phenotypic effect size makes common genetic variants detected by GWAS in cosmopolitan populations difficult to follow up functionally. Genetic variants of larger effect size are rare in populations of largely unrelated individuals, but may become detectable in isolate populations with higher levels of kinship. Such variants are unlikely to be captured by standard SNP genotyping and so require sequence-level analysis.
This project will identify novel and rare variants in whole-exome sequence data that have cis-acting effects on protein abundance in 8000 individuals from isolate populations from the Northern Isles of Scotland and from Croatia. Evidence for the role of protein expression in disease aetiology will be explored using two-sample MR. These studies will include re-contact of cohort participants carrying rare variants with large predicted phenotypic effects for a more detailed phenotypic assessment and generation of tractable biological samples. Further study will include exploiting proteomic and other 'omic data to build graphical predictive models for individual traits or diseases incorporating multiple loci.
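For readers unfamiliar with two-sample MR, a minimal sketch of the single-instrument case is given here. It uses the Wald ratio, a standard estimator that divides the variant's effect on the outcome by its effect on the exposure (here, protein abundance), with a delta-method standard error. The summary does not state which MR estimator the project uses, so this choice, the function name `wald_ratio` and the example numbers are all assumptions made for illustration.

```python
import math


def wald_ratio(beta_exposure, se_exposure, beta_outcome, se_outcome):
    """Wald ratio estimate of the causal effect of an exposure on an outcome.

    beta_exposure / se_exposure: variant effect on the exposure (e.g. protein
    level) from one sample. beta_outcome / se_outcome: effect of the same
    variant on the outcome from a second, independent sample.
    Returns (estimate, standard_error).
    """
    if beta_exposure == 0:
        raise ValueError("variant has no effect on the exposure")

    estimate = beta_outcome / beta_exposure

    # Delta-method variance, assuming the two samples are independent.
    variance = (se_outcome ** 2 / beta_exposure ** 2
                + beta_outcome ** 2 * se_exposure ** 2 / beta_exposure ** 4)
    return estimate, math.sqrt(variance)


# Hypothetical summary statistics for one cis-acting variant.
estimate, se = wald_ratio(beta_exposure=0.5, se_exposure=0.05,
                          beta_outcome=0.1, se_outcome=0.02)
print(f"causal estimate={estimate:.3f} se={se:.4f}")
```

The estimate is only interpretable as causal if the variant is a valid instrument, that is, if it affects the outcome solely through the exposure. With several independent variants, per-variant ratios are typically combined, for example by inverse-variance weighting.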

 
Description Instructor on the course "Introduction to Statistics" for MRC HGU IGMM postgraduate students
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Lecture on "Omics" as part of Genetic Epidemiology course for Masters of Public Health
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
 
Title Interactive tool for exploring genetic regulation of Immunoglobulin G glycosylation 
Description Interactive tool for exploring genome-wide associations of IgG glycosylation and secondary data created for the manuscript on genetic regulation of IgG glycosylation. 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact This tool enables quick and easy querying of genetic associations with IgG glycosylation. It is aimed at helping wet-lab scientists, glycomics enthusiasts and the general public to look up which genetic variants are involved in IgG glycosylation and what their potential role is in diseases and complex traits. 
URL https://shiny.igmm.ed.ac.uk/igg_glycans_gwas/
 
Title Summary statistics for genome-wide association study of transferrin and IgG glycosylation 
Description Post-translational modifications diversify protein functions and dynamically coordinate their signalling networks, influencing most aspects of cell physiology. Nevertheless, their genetic regulation or influence on complex traits is not fully understood. Here, we compare for the first time the genetic regulation of the same PTM of two proteins - glycosylation of transferrin and immunoglobulin G (IgG). By performing genome-wide association analysis of transferrin glycosylation, we identified 10 significantly associated loci, 9 of which were novel. Comparing these with IgG glycosylation-associated genes, we note protein-specific associations with genes encoding glycosylation enzymes (transferrin - MGAT5, ST3GAL4, B3GAT1; IgG - MGAT3, ST6GAL1), as well as shared associations (FUT6, FUT8). Colocalisation analyses of the latter suggest that different causal variants in the FUT genes regulate fucosylation of the two proteins. Glycosylation of these proteins is thus genetically regulated by both shared and protein specific mechanisms. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact These summary statistics can be used by other researchers for PheWAS, Mendelian Randomisation, Polygenic Risk Score and other similar analyses 
URL https://datashare.ed.ac.uk/handle/10283/4088
 
Title Summary statistics of genome-wide association studies of Immunoglobulin G glycosylation 
Description The majority of proteins undergo post-translational glycosylation, in which complex carbohydrates are attached to the surface of proteins. These can affect protein structure and function, as is the case with Immunoglobulin G, whose effector functions are regulated by the composition of the carbohydrate. Aberrant glycosylation of IgG has been observed in many diseases, but little is understood about the mechanisms behind these changes. This dataset contains summary-level statistics of the largest genome-wide association study of IgG N-glycosylation to date (N=8,090). 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact This dataset will enable other scientists to perform Mendelian randomisation analyses to assess the causal role of IgG glycosylation in various diseases and complex traits, as well as analyses of genetic correlations, meta-analyses and any other analyses suitable for summary-level association data. 
URL https://datashare.is.ed.ac.uk/handle/10283/3238
 
Description Biomarkers causal for severe form of COVID-19 
Organisation Imperial College London
Country United Kingdom 
Sector Academic/University 
PI Contribution I arranged access to University of Edinburgh High Performance Computing, collected proteomic genome-wide association study (GWAS) summary statistics from SCALLOP consortium partners, supervised and contributed to the creation of new proteomic GWAS summary statistics, and performed Mendelian randomisation analyses to search for proteomic biomarkers causal for the severe form of COVID-19.
Collaborator Contribution The partners contributed a short list of biomarker candidates and statistical expertise, helped with creating pipelines and provided intellectual input on the subject.
Impact One manuscript has been published on the pre-print server: https://www.medrxiv.org/content/10.1101/2021.04.01.21254789v1
Start Year 2020
 
Description GenOMICC 
Organisation University of Edinburgh
Department The Roslin Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution In June 2020 I joined Kenny Baillie's GenOMICC team as a data analyst with expertise in genome-wide association studies (GWAS). I provided consultancy and ran analyses that contributed to the publication "Genetic mechanisms of critical illness in COVID-19". We are currently collaborating on a continuation of that study and on a follow-up study in which I am leading the work on finding proteomic biomarkers that cause critical illness.
Collaborator Contribution The GenOMICC team has collected samples from and genotyped critically ill patients across the UK and has generated the GWAS summary statistics that I have used for downstream analyses.
Impact This collaboration has so far resulted in several manuscripts: https://doi.org/10.1038/s41586-020-03065-y https://doi.org/10.1038/s41586-021-03767-x https://doi.org/10.1038/s41586-022-04576-6 https://doi.org/10.1101/2022.03.07.22271833 and there are other manuscripts in preparation.
Start Year 2020
 
Description Genetic regulation of protein glycosylation 
Organisation Genos Glycoscience Laboratory
Country Croatia 
Sector Private 
PI Contribution I am collaborating with this company on research into the genetic regulation of protein glycosylation. I advise on statistical cleaning of the glycome data and share my knowledge and experience of genome-wide association studies. Together with their data analyst, we defined new glycosylation traits that have a more direct biological interpretation and higher potential for translation into clinical practice. Genome-wide association analyses of these new traits were performed on the University's high performance computing cluster Eddie.
Collaborator Contribution Genos Glycoscience Laboratory is a research-intensive SME that specialises in high-throughput glycosylation studies. They provided new glycosylation datasets that complement the existing omics data available in the group, making our cohorts among the richest in omics data. The lead data analyst on the glycomics genome-wide association studies project is an employee of the company.
Impact https://doi.org/10.1093/hmg/ddz054 https://doi.org/10.1101/2021.05.04.442584 https://doi.org/10.1093/hmg/ddab335 Several other manuscripts are in preparation.
Start Year 2018
 
Description Rare exonic variants in Scottish and Croatian genetic isolates 
Organisation Regeneron Pharmaceuticals, Inc.
Country United States 
Sector Private 
PI Contribution The overall goal of the collaboration is to elucidate the contribution of rare exonic variants to complex traits of public health importance. My role in the project is preparation and sharing of phenotype data, discovery of novel rare variants and association analyses. We also provide expertise regarding the phenotype data.
Collaborator Contribution Regeneron Genetics Center provided exome sequencing data on 4000 individuals from our cohorts and expertise in cleaning and analysing this data.
Impact No specific outputs at this stage.
Start Year 2018
 
Description SCALLOP 
Organisation SCALLOP Consortium
Sector Private 
PI Contribution I work on proteomic genome-wide association studies (GWAS), both contributing to other partners' projects by providing summary statistics and advising on aspects of the analyses, and leading a project on GWAS of the cardiometabolic OLINK panel.
Collaborator Contribution Partners in the SCALLOP consortium perform analyses on the proteomic data in their own cohorts and then share their summary statistics, protocols and analysis plans with other cohorts. In my particular collaboration, partners have contributed GWAS summary statistics, statistical advice and data-transfer infrastructure.
Impact There are several manuscripts in preparation, the first of which has been published on a pre-print server: https://www.medrxiv.org/content/10.1101/2021.08.03.21261494v1
Start Year 2020
 
Description Interview for a magazine 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Media (as a channel to the public)
Results and Impact I was interviewed by a Croatian lifestyle magazine about my career and women in STEM. The interview had a good reach, with more than 400 shares, and following its release there was positive feedback from school teachers who shared the story with their students.
Year(s) Of Engagement Activity 2021
URL https://www.telegram.hr/zivot/kao-mala-je-htjela-izucavati-lavove-ali-ih-u-samoboru-nije-bilo-sada-r...
 
Description Interview for evening newspaper 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact I gave an interview about the genetic mechanisms of critical illness in COVID-19 that was featured on the cover of one of the three most-read Croatian newspapers. Following the interview there was increased interest in genetic studies and their potential for drug repurposing.
Year(s) Of Engagement Activity 2020
URL https://www.vecernji.hr/techsci/hrvatska-znanstvenica-otkrila-sve-o-istrazivanju-koje-bi-trebalo-ola...
 
Description Interview for national news 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact In December 2020 I gave an interview for Croatian national news, which was aired on the main news programme on Saturday evening. I described my research into the genetic mechanisms of critical illness in COVID-19 and its findings. In the following days I received further interview requests and there was increased interest in genetic research and its potential for drug repurposing.
Year(s) Of Engagement Activity 2020
URL https://dnevnik.hr/vijesti/koronavirus/u-pronalasku-gena-koji-vode-do-sklonosti-teskom-obliku-covida...
 
Description Interview for newspaper 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact I gave an interview about the genetic mechanisms of critical illness in COVID-19 for a Croatian newspaper. Following the interview there was increased interest in genetic studies and their potential for drug repurposing.
Year(s) Of Engagement Activity 2020
URL https://slobodnadalmacija.hr/vijesti/hrvatska/hrvatica-koja-je-sudjelovala-u-razotkrivanju-velike-mi...
 
Description Publication related press release 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact I wrote a lay summary of my research published in Science Advances ("Sugar-coating the inflammatory response? Glycosylation of IgG is regulated by a large network of genes with effects on inflammatory diseases"), aimed at disseminating the research to a wide audience. The article was viewed more than 50 times on my LinkedIn page and was then re-used for University and Institute press releases, increasing the visibility of the glycobiology field and the understanding of its importance in human diseases.
Year(s) Of Engagement Activity 2020
URL https://www.linkedin.com/pulse/sugar-coating-inflammatory-response-glycosylation-igg-lucija-klaric/?...
 
Description Talk at the Science Festival (Orkney) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Around 100 people were present at the Orkney Science Festival talk about DNA sequencing and what we can learn from it. The audience was very engaged and discussions continued even after the talk.
Year(s) Of Engagement Activity 2018
URL https://www.ed.ac.uk/viking/whats-new/events/orkney-international-science-festival