Determining the causal links and clinical significance of rare genetic variants

Lead Research Organisation: University of Edinburgh

Department Name: MRC Human Genetics Unit

Abstract

GWAS have identified many common genetic variants associated with various traits and diseases. However, most individual effects of common variants at the trait level are very small, so very large sample sizes (i.e. tens of thousands) are required for detection, and the associations found provide low-accuracy predictions of an individual's liability to disease or outcomes after treatment. Genetic effects of common variants on intermediate phenotypes, such as gene expression or protein concentrations, are often much larger than those on traits and diseases, and such associations are thus often detectable in smaller samples of hundreds or thousands of individuals. Combining information from large studies of disease outcomes and smaller transcriptomic or proteomic studies in a two-sample Mendelian randomisation study can provide evidence for a causal path from DNA variation through gene expression to a disease outcome. Even so, the relatively small genetic effects of common variants on phenotypic traits make them difficult to study in the small functional studies feasible in the laboratory.
Some genetic variants have larger genetic effects than the common variants detected by GWAS, but such variants are often kept at low frequency within individual pedigrees by natural selection as a consequence of their larger effects on individual fitness. Such variants are hence difficult to detect in cosmopolitan studies of unrelated individuals, but may become detectable in studies of pedigreed populations, especially where small founder population size and drift may enhance the frequency of otherwise rare variants. Nonetheless, such variants are unlikely to be in LD with, and hence captured by associations with, SNPs on standard arrays. Rare variants of large effect are most likely to be located within or close to expressed genes. Hence, using DNA sequence from the exome and adjacent regions is a good strategy to capture such variants.
In this project we propose to link proteomic data with the exome variants to detect locally (i.e. cis) acting genetic effects on protein concentrations.

Technical Summary

Large studies of disease outcomes combined with studies of intermediate phenotypes (transcriptome, proteome) in two-sample Mendelian randomisation (MR) can be used to provide evidence for a causal path from DNA variation through gene expression to disease outcome. Even so, their small phenotypic effect size makes common genetic variants detected by GWAS in cosmopolitan populations difficult to follow up functionally. Genetic variants of larger effect size are rare in populations of largely unrelated individuals, but may become detectable in isolate populations with higher levels of kinship. Such variants are unlikely to be captured by standard SNP genotyping and so require genome sequence-level analysis.
This project will identify novel and rare variants in whole-exome sequence data that have cis-acting effects on protein abundance in 8000 individuals from isolate populations from the Northern Isles of Scotland and from Croatia. Evidence for the role of protein expression in disease aetiology will be explored using two-sample MR. These studies will include re-contact of cohort participants carrying rare variants with large predicted phenotypic effects for a more detailed phenotypic assessment and generation of tractable biological samples. Further study will include exploiting proteomic and other 'omic data to build graphical predictive models for individual traits or diseases incorporating multiple loci.
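
As an illustration of the two-sample MR step described above, the following is a minimal sketch rather than the project's actual pipeline. It assumes each instrument is an independent cis-acting variant with a summary-level effect on protein level (from one sample) and on disease outcome (from a second sample), computes a per-variant Wald ratio, and pools the ratios with fixed-effect inverse-variance weighting. The function names and all effect sizes are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate: outcome effect divided by exposure effect.

    The standard error uses the first-order delta method, which ignores
    uncertainty in the exposure effect (a common simplification).
    """
    if beta_exposure == 0:
        raise ValueError("variant has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


def ivw(ratios):
    """Fixed-effect inverse-variance-weighted pooling of (estimate, se) pairs.

    Assumes the variants are independent (not in LD with each other).
    """
    weights = [1.0 / se ** 2 for _, se in ratios]
    pooled = sum(w * est for w, (est, _) in zip(weights, ratios)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical summary statistics:
# (effect on protein level, effect on outcome, SE of outcome effect)
variants = [
    (0.5, 0.10, 0.02),
    (0.4, 0.10, 0.04),
]

ratios = [wald_ratio(bx, by, se_y) for bx, by, se_y in variants]
pooled_estimate, pooled_se = ivw(ratios)
print(f"pooled causal estimate: {pooled_estimate:.3f} (SE {pooled_se:.3f})")
```

With a single rare variant of large effect, the analysis reduces to one Wald ratio; pooling only applies when several independent instruments are available for the same protein.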

Funded Value:

£254,657

Funded Period:

Feb 18 - Feb 21

Funder:

MRC

Project Status:

Closed

Project Category:

Fellowship

Project Reference:

MR/R026408/1

Principal Investigator:

Lucija Klaric

Health Category:

Unclassified

Organisations

People	ORCID iD
Lucija Klaric (Principal Investigator / Fellow)

Publications

COVID-19 Host Genetics Initiative (2021) Mapping the human genetic architecture of COVID-19. in Nature

Cvetko A (2020) Glycosylation Alterations in Multiple Sclerosis Show Increased Proinflammatory Potential. in Biomedicines

Fawcett KA (2020) Variants associated with HHIP expression have sex-differential effects on lung function. in Wellcome open research

Frkatovic A (2021) Genetic Regulation of Immunoglobulin G Glycosylation. in Experientia supplementum (2012)

Frkatovic-Hodžic A (2023) Mapping of the gene network that regulates glycan clock of ageing in Aging

Gilly A (2022) Gene-based whole genome sequencing meta-analysis of 250 circulating proteins in three isolated European populations. in Molecular metabolism

Halachev M (2019) Increased ultra-rare variant load in an isolated Scottish population impacts exonic and regulatory regions

Policy Influence
Research Databases and Models
Collaboration
Engagement Activities


Description	Instructor on Course Introduction to Statistics for MRC HGU IGMM postgraduate students
Geographic Reach	Local/Municipal/Regional
Policy Influence Type	Influenced training of practitioners or researchers


Description	Lecture on "Omics" as part of Genetic Epidemiology course for Masters of Public Health
Geographic Reach	Local/Municipal/Regional
Policy Influence Type	Influenced training of practitioners or researchers


Title	GWAS and meta-analysis identifies 49 genetic variants underlying critical Covid-19
Description	Full GWAS data are browsable and available for download at https://genomicc.org/data cohorts.xlsx - Tabular details of genotyping methods, phenotypes and other characteristics of each cohort included in meta-analyses. credible_sets.xlsx - Full table of credible sets of variants obtained from fine-mapping for each genome-wide significant region. gene_level.xlsx - Full results of gene-level analysis, aggregating statistical data for all variants relating to each protein-coding gene. GSMR_eQTL.tsv - Full results from GSMR analysis for RNA expression. GSMR_pQTL.tsv - Full results from GSMR analysis for protein level. twas.xlsx - Full results for colocalisation and TWAS analyses in lung, blood, monocytes and across multiple tissue types.
Type Of Material	Database/Collection of data
Year Produced	2023
Provided To Others?	Yes
URL	https://springernature.figshare.com/articles/dataset/GWAS_and_meta-analysis_identifies_49_genetic_va...


Title	Interactive tool for exploring genetic regulation of Immunoglobulin G glycosylation
Description	Interactive tool for exploring genome-wide associations of IgG glycosylation and secondary data created for the manuscript on genetic regulation of IgG glycosylation.
Type Of Material	Database/Collection of data
Year Produced	2018
Provided To Others?	Yes
Impact	This tool enables quick and easy querying of genetic associations with IgG glycosylation. It is aimed at helping wet-lab scientists, glycomics enthusiasts and the general public to look up which genetic variants are involved in IgG glycosylation and what their potential role is in diseases and complex traits.
URL	https://shiny.igmm.ed.ac.uk/igg_glycans_gwas/


Title	Summary statistics for genome-wide association study of transferrin and IgG glycosylation
Description	Post-translational modifications diversify protein functions and dynamically coordinate their signalling networks, influencing most aspects of cell physiology. Nevertheless, their genetic regulation or influence on complex traits is not fully understood. Here, we compare for the first time the genetic regulation of the same PTM of two proteins - glycosylation of transferrin and immunoglobulin G (IgG). By performing genome-wide association analysis of transferrin glycosylation, we identified 10 significantly associated loci, 9 of which were novel. Comparing these with IgG glycosylation-associated genes, we note protein-specific associations with genes encoding glycosylation enzymes (transferrin - MGAT5, ST3GAL4, B3GAT1; IgG - MGAT3, ST6GAL1), as well as shared associations (FUT6, FUT8). Colocalisation analyses of the latter suggest that different causal variants in the FUT genes regulate fucosylation of the two proteins. Glycosylation of these proteins is thus genetically regulated by both shared and protein specific mechanisms.
Type Of Material	Database/Collection of data
Year Produced	2021
Provided To Others?	Yes
Impact	These summary statistics can be used by other researchers for PheWAS, Mendelian Randomisation, Polygenic Risk Score and other similar analyses
URL	https://datashare.ed.ac.uk/handle/10283/4088


Title	Summary statistics of genome-wide association studies of Immunoglobulin G glycosylation
Description	The majority of proteins undergo post-translational glycosylation, in which complex carbohydrates are attached to the surface of proteins. These can affect protein structure and function, as is the case with Immunoglobulin G, whose effector functions are regulated by the composition of the carbohydrate. Aberrant glycosylation of IgG has been observed in many diseases, but little is understood about the mechanisms behind these changes. This dataset contains summary-level statistics of the largest genome-wide association study of IgG N-glycosylation to date (N=8,090).
Type Of Material	Database/Collection of data
Year Produced	2018
Provided To Others?	Yes
Impact	This dataset will enable other scientists to perform Mendelian Randomisation analyses to assess the causality of IgG glycosylation in various diseases and complex traits, analyses of genetic correlations, meta-analyses and any other analyses suitable for summary-level association data.
URL	https://datashare.is.ed.ac.uk/handle/10283/3238


Description	Biomarkers causal for severe form of COVID-19
Organisation	Imperial College London
Country	United Kingdom
Sector	Academic/University
PI Contribution	I arranged access to University of Edinburgh High Performance Computing, collected proteomic genome-wide association study (GWAS) summary statistics from SCALLOP consortium partners, supervised and contributed to the creation of new proteomic GWAS summary statistics, and performed Mendelian Randomisation analyses to search for proteomic biomarkers causal for the severe form of COVID-19.
Collaborator Contribution	The partners contributed a short list of biomarker candidates and statistical expertise, helped with creating pipelines and provided intellectual input on the subject.
Impact	One manuscript has been published on the pre-print server: https://www.medrxiv.org/content/10.1101/2021.04.01.21254789v1
Start Year	2020


Description	GenOMICC
Organisation	University of Edinburgh
Department	The Roslin Institute
Country	United Kingdom
Sector	Academic/University
PI Contribution	In June 2020 I joined Kenny Baillie's GenOMICC team as a data analyst with expertise in genome-wide association studies (GWAS). I provided consultancy and ran analyses that contributed to the publication "Genetic mechanisms of critical illness in COVID-19". We are currently collaborating on a continuation of that study and on a follow-up study in which I am leading work on finding proteomic biomarkers that cause critical illness.
Collaborator Contribution	The GenOMICC team collected samples from and genotyped critically ill patients across the UK, and generated the GWAS summary statistics that I used for downstream analyses.
Impact	This collaboration has so far resulted in several manuscripts: https://doi.org/10.1038/s41586-020-03065-y https://doi.org/10.1038/s41586-021-03767-x https://doi.org/10.1038/s41586-022-04576-6 https://doi.org/10.1101/2022.03.07.22271833 and other manuscripts are in preparation.
Start Year	2020


Description	Genetic regulation of protein glycosylation
Organisation	Genos Glycoscience Laboratory
Country	Croatia
Sector	Private
PI Contribution	I am collaborating with this company on research into the genetic regulation of protein glycosylation. I advise on statistical cleaning of the glycome data and share my knowledge and experience in genome-wide association studies. Together with their data analyst, we defined new glycosylation traits that have a more direct biological interpretation and higher potential for translation into clinical practice. Genome-wide association analyses of these new phenotypes were performed on the University's high performance computing cluster Eddie.
Collaborator Contribution	Genos Glycoscience Laboratory is a research-intensive SME that specialises in high-throughput glycosylation studies. They provided new glycosylation datasets that complement well the existing omics available in the group, making our cohorts one of the richest cohorts omics-wise. The lead data analyst on the glycomics genome-wide association studies project is an employee of the company.
Impact	https://doi.org/10.1093/hmg/ddz054 https://doi.org/10.1101/2021.05.04.442584 https://doi.org/10.1093/hmg/ddab335 Several other manuscripts are in preparation.
Start Year	2018


Description	Rare exonic variants in Scottish and Croatian genetic isolates
Organisation	Regeneron Pharmaceuticals, Inc.
Country	United States
Sector	Private
PI Contribution	The overall goal of the collaboration is to elucidate the contribution of rare exonic variants to complex traits of public health importance. My role in the project is preparation and sharing of phenotype data, discovery of novel rare variants and association analyses. We also provide expertise regarding the phenotype data.
Collaborator Contribution	Regeneron Genetics Center provided exome sequencing data on 4000 individuals from our cohorts and expertise in cleaning and analysing this data.
Impact	No specific outputs at this stage.
Start Year	2018


Description	SCALLOP
Organisation	SCALLOP Consortium
Sector	Private
PI Contribution	I work on proteomic genome-wide association studies (GWAS), both contributing to other partners' projects by providing summary statistics and advising on aspects of analyses, and leading a GWAS project on the cardiometabolic OLINK panel.
Collaborator Contribution	Partners in the SCALLOP consortium perform analyses on the proteomic data in their own cohorts and then share their summary statistics, protocols and analysis plans with other cohorts. In the case of my particular collaboration, partners have contributed GWAS summary statistics data, statistical advice and data-transfer infrastructure.
Impact	There are several manuscripts in preparation, the first of which has been published on a pre-print server: https://www.medrxiv.org/content/10.1101/2021.08.03.21261494v1
Start Year	2020


Description	Interview for a magazine
Form Of Engagement Activity	A magazine, newsletter or online publication
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Media (as a channel to the public)
Results and Impact	I was interviewed by a Croatian lifestyle magazine about my career and women in STEM. The interview had a good reach, with more than 400 shares, and following its release there was positive feedback from school teachers who shared the story with their students.
Year(s) Of Engagement Activity	2021
URL	https://www.telegram.hr/zivot/kao-mala-je-htjela-izucavati-lavove-ali-ih-u-samoboru-nije-bilo-sada-r...


Description	Interview for evening newspaper
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Media (as a channel to the public)
Results and Impact	I gave an interview about the genetic mechanisms of critical illness in COVID-19 that was reported on the cover of one of the three most-read Croatian newspapers. Following the interview there was increased interest in genetic studies and their potential for drug repurposing.
Year(s) Of Engagement Activity	2020
URL	https://www.vecernji.hr/techsci/hrvatska-znanstvenica-otkrila-sve-o-istrazivanju-koje-bi-trebalo-ola...


Description	Interview for national news
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Media (as a channel to the public)
Results and Impact	In December 2020 I gave an interview for Croatian national news, which was aired on the central news programme on Saturday evening. I described my research into the genetic mechanisms of critical illness in COVID-19 and its findings. In the following days I received further interview requests and there was increased interest in genetic research and its potential for drug repurposing.
Year(s) Of Engagement Activity	2020
URL	https://dnevnik.hr/vijesti/koronavirus/u-pronalasku-gena-koji-vode-do-sklonosti-teskom-obliku-covida...


Description	Interview for newspaper
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Media (as a channel to the public)
Results and Impact	I gave an interview about the genetic mechanisms of critical illness in COVID-19 for a Croatian newspaper. Following the interview there was increased interest in genetic studies and their potential for drug repurposing.
Year(s) Of Engagement Activity	2020
URL	https://slobodnadalmacija.hr/vijesti/hrvatska/hrvatica-koja-je-sudjelovala-u-razotkrivanju-velike-mi...


Description	Publication related press release
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Other audiences
Results and Impact	I wrote a piece for a lay audience describing my research published in Science Advances ("Sugar-coating the inflammatory response? Glycosylation of IgG is regulated by a large network of genes with effects on inflammatory diseases"), aimed at disseminating the research to wide audiences. The article was viewed more than 50 times on my LinkedIn page and was then re-used for University and Institute press releases, increasing the visibility of the glycobiology field and understanding of its importance in human diseases.
Year(s) Of Engagement Activity	2020
URL	https://www.linkedin.com/pulse/sugar-coating-inflammatory-response-glycosylation-igg-lucija-klaric/?...


Description	Talk at the Science Festival (Orkney)
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Public/other audiences
Results and Impact	Around 100 people were present at the Orkney Science Festival talk about DNA sequencing and what we can learn from it. The audience was very engaged and discussions continued even after the talk.
Year(s) Of Engagement Activity	2018
URL	https://www.ed.ac.uk/viking/whats-new/events/orkney-international-science-festival
