Environment-adjusted genetic analysis methods for cardiometabolic traits in African populations

Lead Research Organisation: University of Cambridge
Department Name: MRC Biostatistics Unit

Abstract

There is substantial interest in understanding the biology through which genetic variants affect disease or disease-relevant measurements (e.g. cholesterol levels), as there is evidence that this could lead to better disease treatment and prevention. There has been great success in identifying hundreds of genetic variants associated with many diseases and traits, but very few of these variants have a well-understood role in disease biology. Also, a detected variant does not necessarily affect the trait itself, since it may instead be highly correlated with the variant that causes the effect (the causal variant). The majority of these studies are based on individuals of European ancestry; in contrast, African ancestry populations are under-represented, accounting for only 2% of individuals in studies. This focus on European ancestries limits the global utility of genetic research and creates a high risk of inaccuracies or errors when it is translated into clinical practice or public health policy.

African ancestral cohorts differ substantially from one another in genetic structure: two different European ancestral cohorts are more genetically similar to each other than two different African ancestral cohorts are to each other. This makes it harder to select a single representative measure of the correlation between genetic variants that is appropriate for multiple African ancestries. This measure is needed to construct sets of genetic variants that are likely to contain the true causal variant underlying a genetic association. Current strategies tend to use a measure based on two African ancestries from a publicly available reference panel (1000 Genomes). We will construct two alternatives using genetic data from East, West, and South African ancestries. These will be of use to our proposed analyses, and will also be made freely available so that others can improve their analyses of African ancestries.

Another challenge in genetic studies undertaken in different African populations is that environmental exposures differ between them and could have an impact on disease-related traits. Examples include markers of infection with diseases such as malaria. Current methods for identifying genetic variants associated with traits and the fine-tuning of potential causal variants do not account for any environmental exposures, and doing so could lead to better detection of associations and greater accuracy. We propose environment-adjusted methods for detecting associations and selecting potential causal variants using information from one trait, as well as sharing information between traits to further reduce the set of potential causal variants. These identified potential causal variants will then be used to construct genetic risk scores (GRS) for African ancestries.

GRS could contribute to assessing a person's risk level for developing a disease. The majority of GRS are based on European ancestries and are unlikely to be transferable to African ancestries. We will therefore derive GRS based on our environment-adjusted results and compare these to those based on European ancestries to examine the transferability of GRS between ancestries.
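
As a minimal illustration of what a GRS computes, the sketch below assumes the common additive form: a weighted sum of a person's effect-allele counts, with one weight per variant. The function name, variant labels, and weights are hypothetical and are not results from this project.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies per variant).

    Only variants present in both dictionaries contribute to the score.
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical per-variant weights (illustrative values only).
weights = {"rsA": 0.5, "rsB": -0.25, "rsC": 1.0}
# One individual's effect-allele counts at the same variants.
person = {"rsA": 2, "rsB": 1, "rsC": 0}

score = genetic_risk_score(person, weights)
print(score)
```

Transferability between ancestries can then be examined by scoring the same individuals with weights derived from different ancestries and comparing how well each score predicts the trait.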

All methods will be freely available on-line in user-friendly software for others to use in their own analyses. We will also provide an on-line database of African ancestry reference panels for use in other African genetic studies. These are expected to be of use to both methodological and applied researchers.

Technical Summary

Our five aims address the high genetic diversity of African ancestries and their environmental exposures (e.g. infection markers of malaria) that likely impact the variability of disease-related traits. Current methods for detecting genetic associations and fine-mapping do not account for environmental exposures; such adjustments should improve both the power to detect genetic associations and fine-mapping resolution. Our proposed methods need only genome-wide association study (GWAS) summary data and will be accompanied by software.
The proposed environment-adjusted meta-regression of GWAS includes covariates that account for differences in environmental exposures and genetic background between ancestries. This framework allows each variant to be tested for association across all GWAS, and any heterogeneity of effects among the cohorts to be identified.
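
The general idea of a meta-regression on summary data can be sketched as follows. This is not the proposed method: it assumes a single cohort-level covariate (for example, a hypothetical exposure prevalence per cohort), fixed effects, and inverse-variance weights, and all names and numbers are illustrative.

```python
def meta_regression(betas, ses, covariate):
    """Inverse-variance weighted least squares of per-cohort effect
    estimates on one cohort-level covariate.

    Returns (intercept, slope, residual_q). residual_q is the weighted sum
    of squared residuals, a Cochran's-Q-style measure of the heterogeneity
    left over after adjusting for the covariate.
    """
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, covariate)) / sw
    ybar = sum(wi * y for wi, y in zip(w, betas)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, covariate))
    sxy = sum(wi * (x - xbar) * (y - ybar)
              for wi, x, y in zip(w, covariate, betas))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    residual_q = sum(wi * (y - intercept - slope * x) ** 2
                     for wi, x, y in zip(w, covariate, betas))
    return intercept, slope, residual_q


# Hypothetical summary statistics for one variant in three cohorts.
betas = [0.10, 0.15, 0.20]      # per-cohort effect estimates
ses = [0.02, 0.02, 0.04]        # their standard errors
exposure = [0.0, 0.5, 1.0]      # cohort-level covariate (assumed)

intercept, slope, residual_q = meta_regression(betas, ses, exposure)
print(intercept, slope, residual_q)
```

In this toy input the effects are exactly linear in the covariate, so the residual heterogeneity is essentially zero; with real data a large residual value would point to heterogeneity that the covariate does not explain.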
Fine-mapping of genetic associations relies on a representative linkage disequilibrium (LD) matrix, which records the correlations between genetic variants. The common strategy of using an LD matrix based on the 1000 Genomes African ancestries has not yet been assessed for appropriateness. We propose an alternative LD matrix based on East, West, and South African ancestral cohorts and consider two different approaches to its construction.
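
For readers unfamiliar with the object itself, the sketch below computes an LD matrix as the pairwise Pearson correlation of allele dosages within one reference panel. It does not represent either of the two proposed construction approaches, and the toy genotypes are made up.

```python
from math import sqrt


def ld_matrix(genotypes):
    """Pairwise Pearson correlation (r) between variants.

    genotypes: one row per individual, each row a list of allele dosages
    (0, 1 or 2), one entry per variant. Assumes every variant is
    polymorphic in the panel, so no column has zero variance.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    cols = [[row[j] for row in genotypes] for j in range(m)]
    means = [sum(c) / n for c in cols]
    centred = [[g - mu for g in c] for c, mu in zip(cols, means)]
    norms = [sqrt(sum(v * v for v in c)) for c in centred]
    return [
        [sum(a * b for a, b in zip(centred[i], centred[j]))
         / (norms[i] * norms[j])
         for j in range(m)]
        for i in range(m)
    ]


# Toy panel: 4 individuals, 3 variants (illustrative values only).
panel = [
    [0, 0, 2],
    [1, 1, 1],
    [2, 2, 0],
    [1, 1, 1],
]
ld = ld_matrix(panel)
for row in ld:
    print([round(r, 3) for r in row])
```

Here the first two variants are perfectly correlated and the third is perfectly anti-correlated with both, which is the situation in which association data alone cannot distinguish a causal variant from its neighbours.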
Upon identifying loci that have genetic associations with at least one trait, we proceed to fine-mapping, adjusted for environmental exposures. Two single-trait fine-mapping approaches are proposed: (i) stepwise conditioning; (ii) Bayesian variable selection. For loci that have genetic associations with multiple traits, we propose an environment-adjusted multi-trait fine-mapping approach in a Bayesian framework. Genetic risk scores based on these results are expected to be more precise than those that do not adjust for environmental exposure and are based on single traits.
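
To make the notion of a set of potential causal variants concrete, the sketch below builds a credible set from summary statistics under strong simplifying assumptions that are not part of the proposal: a single causal variant per locus, equal prior probability for every variant, Wakefield's approximate Bayes factor, and no environmental adjustment. The prior standard deviation and the input values are hypothetical.

```python
from math import exp, log


def log_abf(beta, se, prior_sd=0.2):
    """Log of Wakefield's approximate Bayes factor (association vs. none).

    prior_sd is the assumed standard deviation of the prior on the effect.
    """
    v = se ** 2
    w = prior_sd ** 2
    z = beta / se
    return 0.5 * log(v / (v + w)) + 0.5 * z ** 2 * w / (v + w)


def credible_set(stats, coverage=0.95):
    """stats: dict mapping variant -> (beta, se).

    Returns (variants in the credible set, posterior probabilities),
    assuming exactly one causal variant and equal priors.
    """
    labf = {v: log_abf(b, s) for v, (b, s) in stats.items()}
    top = max(labf.values())  # subtract the maximum for numerical stability
    unnorm = {v: exp(l - top) for v, l in labf.items()}
    total = sum(unnorm.values())
    pp = {v: u / total for v, u in unnorm.items()}
    chosen, cumulative = [], 0.0
    for v in sorted(pp, key=pp.get, reverse=True):
        chosen.append(v)
        cumulative += pp[v]
        if cumulative >= coverage:
            break
    return chosen, pp


# Hypothetical summary statistics at one locus.
stats = {"rsA": (0.30, 0.05), "rsB": (0.28, 0.05), "rsC": (0.05, 0.05)}
chosen, pp = credible_set(stats)
print(chosen)
```

Bayesian variable selection methods relax the single-causal-variant assumption by using the LD matrix to evaluate models with several causal variants at once.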
Performance of methods will be assessed by simulation studies, and they will be applied to cardiometabolic traits from unique African ancestral cohorts.
