The Genetics of Symptom Severity in COVID-19 Infections
Lead Research Organisation:
European Bioinformatics Institute
Department Name: Birney Research
Abstract
Copy number variation (CNV) is an important class of genetic variation that can have large impacts on human health. Much is known about genomic disorders caused by rare CNVs at specific locations in the genome, and there is good evidence for the role of common CNVs across a variety of human traits. To date, progress on large-scale CNV association studies from exome or genome sequence data has been limited by methodological and technological constraints. The majority of CNV association studies have been performed using SNP genotyping arrays, which suffer from low CNV resolution and limited dose response. We have recently developed methods to allow large-scale copy number association tests from exome sequences within the UK Biobank. These methods yield high-resolution (exon-level) CNV information with good association discovery signal for human traits. We have run copy number association testing for a variety of human traits and have generated robust results verifying some important regions within genes previously known to impact these traits. Furthermore, these analyses have produced new findings that have not been described previously but show great promise in terms of, for example, gene function.
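The dose-response idea behind a copy number association test can be illustrated with a minimal sketch: regress a quantitative trait on the copy number estimated at one exon and ask whether the slope differs from zero. This is an ordinary least-squares illustration only, not the method used in this project; the function name `copy_number_association` and the toy values are hypothetical, and a real analysis would also adjust for covariates such as age, sex and ancestry.

```python
import math


def copy_number_association(copy_numbers, trait_values):
    """Least-squares regression of a trait on copy number dose at one locus.

    Returns (slope, standard_error, t_statistic). Illustrative only:
    no covariates, no multiple-testing correction.
    """
    n = len(copy_numbers)
    if n != len(trait_values):
        raise ValueError("copy_numbers and trait_values must have equal length")
    if n < 3:
        raise ValueError("need at least three samples")

    mean_x = sum(copy_numbers) / n
    mean_y = sum(trait_values) / n
    sxx = sum((x - mean_x) ** 2 for x in copy_numbers)
    if sxx == 0:
        raise ValueError("no copy number variation at this locus")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(copy_numbers, trait_values))

    slope = sxy / sxx                      # trait change per extra copy
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(copy_numbers, trait_values))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.inf
    return slope, std_err, t_stat


if __name__ == "__main__":
    # Hypothetical toy data: copy number at one exon and a trait value.
    cn = [1, 2, 2, 2, 3, 3, 2, 1]
    trait = [4.1, 5.0, 5.3, 4.8, 6.2, 5.9, 5.1, 4.3]
    slope, std_err, t_stat = copy_number_association(cn, trait)
    print(f"slope={slope:.3f} se={std_err:.3f} t={t_stat:.2f}")
```

Treating copy number as a numeric dose, rather than as a present/absent call, is what lets a test of this kind pick up graded effects across deletions and duplications.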
We are undertaking research involving genetic association testing for SNPs and CNVs in COVID-19 patients using data from the UK Biobank and Genomics England. Our CNV association methods can be scaled up to generate results across extremely large whole-exome and whole-genome sequencing datasets. Importantly, we are able to correlate GWAS findings between these two classes of genetic variation. In some cases, where certain SNPs tag specific CNVs well, the differences in human trait distributions could be discovered by the two approaches independently. However, there are a large number of CNVs that cannot be well tagged by single or multiple SNPs, and it is these associations that would not be found using other approaches. Additionally, the interplay between SNPs and CNVs adds important information to the understanding of human traits and to the genetics of differences in symptom severity seen in COVID-19 infection. We are actively developing SNP-CNV imputation methods so that we can build robust imputation models for CNV locations across the genome. This allows us to impute copy number information into further SNP genotyping cohorts worldwide, adding considerable value to the global association testing efforts for COVID-19. Many groups across the world are actively undertaking genetic research into COVID-19 susceptibility using large-scale GWAS analysis from SNP genotyping arrays. However, far fewer can quickly and effectively leverage the CNV information available from whole-exome and whole-genome sequence data to allow genome-wide association testing of CNVs. It is clear that differences in copy number can cause large differences in human traits and influence health. CNVs are certain to be one of the genetic sources of the differences that we observe across the wide range of responses to COVID-19 infection.
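As a rough illustration of what it means for a SNP to tag a CNV well, the sketch below computes the squared Pearson correlation between SNP genotype dosages (0, 1 or 2 copies of an allele) and copy number across the same individuals. The function names and the 0.8 cut-off are assumptions chosen for illustration (0.8 is a common convention for linkage disequilibrium tagging, not a value taken from this project), and this is not the project's imputation method.

```python
def tagging_r2(snp_dosages, copy_numbers):
    """Squared Pearson correlation between SNP dosage and copy number."""
    n = len(snp_dosages)
    if n != len(copy_numbers) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_s = sum(snp_dosages) / n
    mean_c = sum(copy_numbers) / n
    sss = sum((s - mean_s) ** 2 for s in snp_dosages)
    scc = sum((c - mean_c) ** 2 for c in copy_numbers)
    ssc = sum((s - mean_s) * (c - mean_c)
              for s, c in zip(snp_dosages, copy_numbers))

    # A monomorphic SNP or an invariant CNV carries no tagging information.
    if sss == 0 or scc == 0:
        return 0.0
    return (ssc * ssc) / (sss * scc)


def is_well_tagged(r2, threshold=0.8):
    """Apply an assumed r-squared cut-off (hypothetical default of 0.8)."""
    return r2 >= threshold


if __name__ == "__main__":
    # Hypothetical toy data for four individuals.
    snp = [0, 1, 2, 1]
    tagged_cnv = [2, 3, 4, 3]      # tracks the SNP dosage
    untagged_cnv = [3, 2, 2, 3]    # varies independently of the SNP
    for label, cnv in (("tagged", tagged_cnv), ("untagged", untagged_cnv)):
        r2 = tagging_r2(snp, cnv)
        print(label, round(r2, 3), is_well_tagged(r2))
```

A CNV with low r-squared against every nearby SNP is the kind of variant that an array-based SNP GWAS would miss and that a direct copy number test could still detect.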
Results from this project will contribute to an improved understanding of the genetic basis of differences in symptom severity among COVID-19 cases. There are likely to be a large number of specific risk factors based on rare variants in the human population that confer an increased risk of severe symptoms. It is unclear whether there will be a single variant (or a small number of variants) that is highly significant, with a large effect size, and that predisposes individuals to an increased risk of severe symptoms. It is, however, likely that a large number of rare variants, or combinations of rare and common variants, lower an individual's robustness to COVID-19 infection overall. It is entirely possible that commonly observed CNVs are associated with differences in COVID-19 symptom severity, and this is an important research area that could have a high impact globally.
Publications
Fitzgerald T
(2022)
CNest: A novel copy number association discovery method uncovers 862 new associations from 200,629 whole-exome sequence datasets in the UK Biobank.
in Cell genomics
Vöhringer HS
(2022)
Publisher Correction: Genomic reconstruction of the SARS-CoV-2 epidemic in England.
in Nature
Description | This work has provided insight into the genetics of COVID severity. We hope to publish it as a preprint in 2022, once more data from the GenOMICC consortium arrive (we are currently at the limit of statistical power). |
Exploitation Route | This work informs future drug development work on COVID. |
Sectors | Healthcare |
Description | We have provided policy and epidemic advice to UK government and other governments informed by this work |
First Year Of Impact | 2021 |
Sector | Healthcare |
Impact Types | Societal,Policy & public services |
Description | Joint project with GeL/GenOMICC on testing the copy number burden and DNA methylation |
Geographic Reach | National |
Policy Influence Type | Participation in a guidance/advisory committee |
Impact | The work has led to direct advice on genetic impacts on COVID, e.g. to SAGE and the SAGE transmission group. An example of negative data here is the *lack* of support for Vitamin D being causally involved in COVID severity. A more complex scenario is the role of differing allele frequencies in the different prevalence observed across ethnicities. Although allele frequencies for COVID severity variants do differ, these allele frequency skews are far more localised globally than the distribution of self-identified ethnicity (e.g. "South Asian"), which shows the importance of non-genetic effects (alongside some genetic effects). |
Description | Member of the International Best Practice Advisory Group |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Implementation circular/rapid advice/letter to e.g. Ministry of Health |
Impact | In particular, we informed the genetics side of the vitamin D and COVID debate. We informed the UK analysis of B.1.1.7 strain levels. We provide advice to the French, German and Indian governments. |
Description | Collaboration with the GenOMICC consortium |
Organisation | University of Edinburgh |
Department | Edinburgh Clinical Trials Unit |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Joining the large consortium exploring the genetics of COVID, led by Kenneth Baillie |
Collaborator Contribution | We are providing both copy number variation and DNA methylation analysis |
Impact | Working together on a joint paper |
Start Year | 2022 |
Title | CNest |
Description | Calculates Copy Number levels in a normalised manner. |
Type Of Technology | Software |
Year Produced | 2021 |
Open Source License? | Yes |
Impact | We have run this on the UK Biobank as a proof of principle, and on the GeL/GenOMICC consortium datasets for COVID. |