Environment-adjusted genetic analysis methods for cardiometabolic traits in African populations
Lead Research Organisation:
UNIVERSITY OF CAMBRIDGE
Department Name: MRC Biostatistics Unit
Abstract
There is substantial interest in understanding the biology by which genetic variants affect disease or disease-relevant measurements (e.g. cholesterol levels), as there is evidence that this could lead to better disease treatment and prevention. There has been great success in identifying hundreds of genetic variants associated with many diseases and traits, but very few of these variants have a well-understood role in disease biology. Also, a detected variant does not necessarily affect the trait itself, since it may instead be highly correlated with the variant that causes the effect (i.e. the causal variant). The majority of these studies are based on individuals of European ancestry; in contrast, African ancestry populations are under-represented, accounting for only 2% of individuals in studies. This focus on European ancestries limits the global utility of the findings and carries a high risk of inaccuracies or errors in the translation of genetic research into clinical practice or public health policy.
African ancestral cohorts have a high level of genetic structural differences between them - two different European ancestral cohorts are more genetically similar to each other than two different African ancestral cohorts are to each other. This increases the challenge in selecting a single representative measure of the correlation between genetic variants that is appropriate for multiple African ancestries. This measure is needed to construct sets of genetic variants that are likely to contain the true causal variant underlying a genetic association. Current strategies tend to use a measure based on two African ancestries from a publicly available reference panel (1000 Genomes). We will construct two alternatives using genetic data from East, West, and South African ancestries. This will be of use to our proposed analyses, and will also be made freely available for others to improve their analyses of African ancestries.
Another challenge in genetic studies undertaken in different African populations is that there are environmental exposure differences between them that could have an impact on disease-related traits. An example is infection markers of diseases, such as malaria. Current methods for identifying genetic variants associated with traits and the fine-tuning of potential causal variants do not account for any environmental exposures, and doing so could lead to better detection of associations and greater accuracy. We propose environment-adjusted methods for detecting associations and selecting potential causal variants using information from one trait, as well as sharing information between traits to further reduce the set of potential causal variants. These identified potential causal variants will then be used to construct genetic risk scores (GRS) for African ancestries.
GRS could contribute to assessing a person's risk of developing a disease. The majority of GRS are based on European ancestries and are unlikely to be transferable to African ancestries. We will therefore derive GRS based on our environment-adjusted results and compare them with GRS based on European ancestries to examine the transferability of GRS between ancestries.
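As a rough illustration of what a GRS is, the sketch below computes one in its most common form: a weighted sum of a person's risk-allele counts. The function name, the example weights, and the two weight sets are hypothetical and do not come from this project; the sketch only shows how the same genotype can receive different scores under weights derived in different ancestries.

```python
def genetic_risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant).

    weights are per-variant effect sizes, e.g. GWAS effect estimates.
    """
    if len(allele_counts) != len(weights):
        raise ValueError("need one weight per variant")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical example: one person's genotype at three variants, scored with
# two made-up weight sets standing in for different source ancestries.
person = [0, 1, 2]
weights_a = [0.10, 0.20, 0.30]
weights_b = [0.12, 0.05, 0.25]
score_a = genetic_risk_score(person, weights_a)
score_b = genetic_risk_score(person, weights_b)
```

Comparing scores like `score_a` and `score_b` across many individuals, and against observed trait values, is one simple way to think about transferability.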
All methods will be freely available on-line in user-friendly software for others to use in their own analyses. We will also provide an on-line database of African ancestry reference panels for use in other African genetic studies. These are expected to be of use to both methodological and applied researchers.
Technical Summary
Our five aims address the high genetic diversity of African ancestries and their environmental exposures (e.g. infection markers of malaria) that likely affect the variability of disease-related traits. Current methods for detecting genetic associations and fine-mapping do not account for environmental exposures; such adjustments should improve both the power to detect genetic associations and the resolution of fine-mapping. Our proposed methods need only genome-wide association study (GWAS) summary data and will be accompanied by software.
The proposed environment-adjusted meta-regression of GWAS includes covariates that account for differences in environmental exposures and in genetic background between ancestries. This framework allows each variant to be tested for association across all GWAS, and also allows any heterogeneity of effects among the cohorts to be identified.
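The general idea of such a meta-regression can be sketched for a single variant as a weighted least-squares fit of the cohort effect estimates on cohort-level covariates. This is a minimal illustration under stated assumptions, not the project's env-MR-MEGA implementation: the function names, the use of inverse-variance weights, and the way the test statistics are formed (differences in weighted residual sums of squares between nested models) are assumptions chosen for clarity.

```python
import numpy as np


def weighted_rss(beta, se, design):
    """Inverse-variance-weighted residual sum of squares of beta on design."""
    w = 1.0 / se**2
    if design.shape[1] == 0:  # null model: every cohort effect is zero
        return float(np.sum(w * beta**2))
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], beta * sw, rcond=None)
    return float(np.sum(w * (beta - design @ coef) ** 2))


def env_adjusted_meta_regression(beta, se, ancestry_axes, env):
    """Chi-square statistics for one variant across K cohorts.

    beta, se      : per-cohort effect estimates and standard errors (length K)
    ancestry_axes : K x T cohort-level axes of genetic variation (assumed given)
    env           : K x S cohort-level environmental exposures (assumed given)
    """
    beta = np.asarray(beta, float)
    se = np.asarray(se, float)
    k = len(beta)
    ones = np.ones((k, 1))
    axes = np.asarray(ancestry_axes, float).reshape(k, -1)
    expo = np.asarray(env, float).reshape(k, -1)
    full = np.hstack([ones, axes, expo])
    n_cov = full.shape[1] - 1

    rss_null = weighted_rss(beta, se, np.empty((k, 0)))
    rss_intercept = weighted_rss(beta, se, ones)
    rss_full = weighted_rss(beta, se, full)
    return {
        # any effect at all (intercept or covariate-dependent)
        "association_chisq": rss_null - rss_full,
        "association_df": n_cov + 1,
        # heterogeneity explained by ancestry and environment covariates
        "heterogeneity_chisq": rss_intercept - rss_full,
        "heterogeneity_df": n_cov,
        # heterogeneity left unexplained by the covariates
        "residual_chisq": rss_full,
        "residual_df": k - n_cov - 1,
    }
```

The number of cohorts must exceed the number of covariates plus one for the residual term to have positive degrees of freedom, which limits how many exposures can be adjusted for in a small collection of cohorts.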
Fine-mapping of genetic associations relies on a representative LD matrix. There has not yet been an assessment of the appropriateness of the common strategy of using an LD matrix based on the 1000 Genomes African ancestries. We propose an alternative LD matrix based on East, West, and South African ancestral cohorts and consider two different approaches to its construction.
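To make the notion of an LD matrix concrete, the sketch below computes pairwise correlations between variants from genotype data and shows two simple ways of combining several cohorts. These two combinations (pooling individuals, and sample-size-weighted averaging of per-cohort matrices) are illustrative assumptions only; the source does not specify which two construction approaches the project considers, and all function names here are hypothetical.

```python
import numpy as np


def ld_matrix(genotypes):
    """Pearson correlation between variants.

    genotypes: individuals x variants array of allele counts (0, 1 or 2).
    A variant with no variation in the sample gives NaN entries.
    """
    return np.corrcoef(np.asarray(genotypes, float), rowvar=False)


def pooled_ld(cohorts):
    """LD computed after stacking all individuals from all cohorts."""
    return ld_matrix(np.vstack(cohorts))


def weighted_average_ld(cohorts):
    """Sample-size-weighted average of the per-cohort LD matrices."""
    sizes = np.array([len(g) for g in cohorts], float)
    mats = np.stack([ld_matrix(g) for g in cohorts])
    return np.tensordot(sizes / sizes.sum(), mats, axes=1)
```

The two combinations generally differ: pooling lets allele-frequency differences between cohorts contribute to the correlation, whereas averaging reflects only within-cohort correlation.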
Upon identifying loci that have genetic associations with at least one trait, we proceed to fine-mapping, adjusted for environmental exposures. Two single-trait fine-mapping approaches are proposed: (i) stepwise conditioning; (ii) Bayesian variable selection. For loci that have genetic associations with multiple traits, we propose an environment-adjusted multi-trait fine-mapping approach in a Bayesian framework. Genetic risk scores based on these results are expected to be more precise than those that do not adjust for environmental exposures and are based on single traits.
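The output of Bayesian fine-mapping is typically a credible set of variants. The sketch below shows a much simpler technique than the Bayesian variable selection proposed above: it assumes exactly one causal variant in the locus, uses Wakefield's approximate Bayes factor from summary statistics, and ignores LD and environmental adjustment. The prior standard deviation of 0.2 and all function names are assumptions made for illustration.

```python
import math


def log_abf(beta, se, prior_sd=0.2):
    """Log of Wakefield's approximate Bayes factor in favour of association."""
    v = se**2
    w = prior_sd**2
    r = w / (v + w)
    z = beta / se
    return 0.5 * math.log(1.0 - r) + 0.5 * z * z * r


def posterior_probs(betas, ses, prior_sd=0.2):
    """Posterior probability that each variant is the single causal variant."""
    logs = [log_abf(b, s, prior_sd) for b, s in zip(betas, ses)]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [weight / total for weight in weights]


def credible_set(probs, coverage=0.95):
    """Smallest set of variant indices whose probabilities reach coverage."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += probs[i]
        if cumulative >= coverage:
            break
    return chosen
```

A smaller credible set means fewer candidate variants to follow up, which is why a reduction in credible set size is a natural measure of fine-mapping resolution.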
Performance of methods will be assessed by simulation studies, and they will be applied to cardiometabolic traits from unique African ancestral cohorts.
Publications

Wang S
(2024)
Accounting for heterogeneity due to environmental sources in meta-analysis of genome-wide association studies.
in Communications biology
Description | Environment-adjusted meta-regression of lipids traits in African ancestry populations |
Organisation | Harvard University |
Department | Harvard Medical School |
Country | United States |
Sector | Academic/University |
PI Contribution | Under this grant, Siru Wang has joined my group as an RA (Dec 2022), and Oyesola Ojewunmi, jointly supervised by myself and Segun Fatumo (LSHTM, MRC Uganda), has joined Segun Fatumo's group in LSHTM (joined Oct 2023). Siru and I have developed the env-MR-MEGA method and designed simulation studies, with input from Andrew Morris (Manchester). Siru has carried out extensive simulation studies to evaluate our new method and compare with the original MR-MEGA method, and she has developed an R library for our new method. We have also given guidance to Oyesola on the use of env-MR-MEGA for the analysis of LDL cholesterol in 12 sex-stratified cohorts from Africa (total sample size approximately 20,000). This method only requires summary-level data from the cohorts. |
Collaborator Contribution | Andrew Morris (Manchester) has given input on the methods development and simulation design, as well as practical aspects in the analysis of the 12 sex-stratified cohorts from Africa. Segun Fatumo (LSHTM, MRC Uganda), Tinashe Chikowore (Witwatersrand, Harvard), and Michele Ramsay (Witwatersrand) have given input on practical aspects in the analysis of the 12 sex-stratified cohorts from Africa and have acquired the data for these analyses. Tinashe Chikowore has carried out some of the genome-wide association studies (GWAS) within each cohort and has given guidance on carrying out GWAS in remaining cohorts by Oyesola Ojewunmi and a member of his group. |
Impact | This collaboration includes co-applicants and collaborators for this grant. It is a multi-disciplinary collaboration involving methodology expertise from my team and from Andrew Morris (Manchester) and expertise in cardiometabolic traits and diversity amongst African ancestry populations from Segun Fatumo (LSHTM, MRC Uganda), Tinashe Chikowore (Witwatersrand, Harvard), and Michele Ramsay (Witwatersrand). Siru Wang will present this research at the European Society of Human Genetics annual meeting 2024 and we will submit this manuscript (with software) for publication in the coming weeks. |
Start Year | 2023 |
Description | Environment-adjusted meta-regression of lipids traits in African ancestry populations |
Organisation | Medical Research Council (MRC) |
Department | MRC/UVRI and LSHTM Research Unit Uganda |
Country | Uganda |
Sector | Academic/University |
PI Contribution | Under this grant, Siru Wang has joined my group as an RA (Dec 2022), and Oyesola Ojewunmi, jointly supervised by myself and Segun Fatumo (LSHTM, MRC Uganda), has joined Segun Fatumo's group in LSHTM (joined Oct 2023). Siru and I have developed the env-MR-MEGA method and designed simulation studies, with input from Andrew Morris (Manchester). Siru has carried out extensive simulation studies to evaluate our new method and compare with the original MR-MEGA method, and she has developed an R library for our new method. We have also given guidance to Oyesola on the use of env-MR-MEGA for the analysis of LDL cholesterol in 12 sex-stratified cohorts from Africa (total sample size approximately 20,000). This method only requires summary-level data from the cohorts. |
Collaborator Contribution | Andrew Morris (Manchester) has given input on the methods development and simulation design, as well as practical aspects in the analysis of the 12 sex-stratified cohorts from Africa. Segun Fatumo (LSHTM, MRC Uganda), Tinashe Chikowore (Witwatersrand, Harvard), and Michele Ramsay (Witwatersrand) have given input on practical aspects in the analysis of the 12 sex-stratified cohorts from Africa and have acquired the data for these analyses. Tinashe Chikowore has carried out some of the genome-wide association studies (GWAS) within each cohort and has given guidance on carrying out GWAS in remaining cohorts by Oyesola Ojewunmi and a member of his group. |
Impact | This collaboration includes co-applicants and collaborators for this grant. It is a multi-disciplinary collaboration involving methodology expertise from my team and from Andrew Morris (Manchester) and expertise in cardiometabolic traits and diversity amongst African ancestry populations from Segun Fatumo (LSHTM, MRC Uganda), Tinashe Chikowore (Witwatersrand, Harvard), and Michele Ramsay (Witwatersrand). Siru Wang will present this research at the European Society of Human Genetics annual meeting 2024 and we will submit this manuscript (with software) for publication in the coming weeks. |
Start Year | 2023 |
Description | Environment-adjusted meta-regression of lipids traits in African ancestry populations |
Organisation | University of Manchester |
Department | School of Biological Sciences |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Under this grant, Siru Wang has joined my group as an RA (Dec 2022), and Oyesola Ojewunmi, jointly supervised by myself and Segun Fatumo (LSHTM, MRC Uganda), has joined Segun Fatumo's group in LSHTM (joined Oct 2023). Siru and I have developed the env-MR-MEGA method and designed simulation studies, with input from Andrew Morris (Manchester). Siru has carried out extensive simulation studies to evaluate our new method and compare with the original MR-MEGA method, and she has developed an R library for our new method. We have also given guidance to Oyesola on the use of env-MR-MEGA for the analysis of LDL cholesterol in 12 sex-stratified cohorts from Africa (total sample size approximately 20,000). This method only requires summary-level data from the cohorts. |
Collaborator Contribution | Andrew Morris (Manchester) has given input on the methods development and simulation design, as well as practical aspects in the analysis of the 12 sex-stratified cohorts from Africa. Segun Fatumo (LSHTM, MRC Uganda), Tinashe Chikowore (Witwatersrand, Harvard), and Michele Ramsay (Witwatersrand) have given input on practical aspects in the analysis of the 12 sex-stratified cohorts from Africa and have acquired the data for these analyses. Tinashe Chikowore has carried out some of the genome-wide association studies (GWAS) within each cohort and has given guidance on carrying out GWAS in remaining cohorts by Oyesola Ojewunmi and a member of his group. |
Impact | This collaboration includes co-applicants and collaborators for this grant. It is a multi-disciplinary collaboration involving methodology expertise from my team and from Andrew Morris (Manchester) and expertise in cardiometabolic traits and diversity amongst African ancestry populations from Segun Fatumo (LSHTM, MRC Uganda), Tinashe Chikowore (Witwatersrand, Harvard), and Michele Ramsay (Witwatersrand). Siru Wang will present this research at the European Society of Human Genetics annual meeting 2024 and we will submit this manuscript (with software) for publication in the coming weeks. |
Start Year | 2023 |
Description | Environment-adjusted meta-regression of lipids traits in African ancestry populations |
Organisation | University of the Witwatersrand |
Department | Faculty of Health Sciences |
Country | South Africa |
Sector | Academic/University |
PI Contribution | Under this grant, Siru Wang has joined my group as an RA (Dec 2022), and Oyesola Ojewunmi, jointly supervised by myself and Segun Fatumo (LSHTM, MRC Uganda), has joined Segun Fatumo's group in LSHTM (joined Oct 2023). Siru and I have developed the env-MR-MEGA method and designed simulation studies, with input from Andrew Morris (Manchester). Siru has carried out extensive simulation studies to evaluate our new method and compare with the original MR-MEGA method, and she has developed an R library for our new method. We have also given guidance to Oyesola on the use of env-MR-MEGA for the analysis of LDL cholesterol in 12 sex-stratified cohorts from Africa (total sample size approximately 20,000). This method only requires summary-level data from the cohorts. |
Collaborator Contribution | Andrew Morris (Manchester) has given input on the methods development and simulation design, as well as practical aspects in the analysis of the 12 sex-stratified cohorts from Africa. Segun Fatumo (LSHTM, MRC Uganda), Tinashe Chikowore (Witwatersrand, Harvard), and Michele Ramsay (Witwatersrand) have given input on practical aspects in the analysis of the 12 sex-stratified cohorts from Africa and have acquired the data for these analyses. Tinashe Chikowore has carried out some of the genome-wide association studies (GWAS) within each cohort and has given guidance on carrying out GWAS in remaining cohorts by Oyesola Ojewunmi and a member of his group. |
Impact | This collaboration includes co-applicants and collaborators for this grant. It is a multi-disciplinary collaboration involving methodology expertise from my team and from Andrew Morris (Manchester) and expertise in cardiometabolic traits and diversity amongst African ancestry populations from Segun Fatumo (LSHTM, MRC Uganda), Tinashe Chikowore (Witwatersrand, Harvard), and Michele Ramsay (Witwatersrand). Siru Wang will present this research at the European Society of Human Genetics annual meeting 2024 and we will submit this manuscript (with software) for publication in the coming weeks. |
Start Year | 2023 |
Title | env-MR-MEGA |
Description | This is an R library for our new statistical method, environment-adjusted meta-regression (env-MR-MEGA). This method assesses genetic variant associations by adjusting for differing environmental exposures between populations. Additionally, env-MR-MEGA quantifies the extent of heterogeneity due to ancestral and environmental effects. It requires only GWAS summary data and cohort-level environmental exposures. The related manuscript has not been finalised, and we will make this software publicly available when we submit to the bioRxiv preprint server. |
Type Of Technology | Software |
Year Produced | 2024 |
Open Source License? | Yes |
Impact | too soon to comment |
Description | Big Biology Day science festival 2023 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | Participation in Big Biology Day 2023 - presenting bespoke hands-on activity based on statistical research taking place at the BSU. |
Year(s) Of Engagement Activity | 2023 |
Description | Quinquennial Review (QQR) of the MRC Biostatistics Unit 2023 |
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I am a member of the senior leadership team for the Causal Mechanisms (CM) research theme in our Unit and was part of the CM discussion panel at the Unit's QQR. In our session, I responded to questions from the assessment panel regarding research in the Unit, particularly my plans for additional multi-ancestry methods development and large-scale analyses with collaborators. As an example of research in the Unit, Feng Zhou (RA hired under my CDA funding) presented a poster to the assessment panel on a user-friendly interactive web tool that allows researchers with little programming experience to explore genetic associations of traits within UK Biobank and to run their own multi-trait analyses (flashfm method developed in my group) to prioritise likely causal variants that are shared and distinct between traits. Both the method and the initiative to make such analyses easily accessible to a wider range of researchers were of high interest. Within the Unit's QQR progress review, my flashfm multi-trait fine-mapping method and its applications were a featured case study. Working towards increasing health equity for understudied populations from Africa, and with collaborators from Africa, we conducted joint multi-trait fine-mapping with flashfm of lipids traits among 125,000 individuals of African ancestry. This resulted in a mean 18% reduction in credible set size and revealed potential causal variants not detected by single-trait fine-mapping. Our interpretations and plots in this paper were simplified by our flashfm-ivis R shiny app (Zhou Bioinfor 2022), which enables interactive visualisations of flashfm results and may be used by professionals with little programming experience. Although first published only in 2021, flashfm is already in use by at least seven research groups from four countries, including Barroso (Exeter) for the joint analysis of glycaemic traits (Soenksen Diab 2021). |
Year(s) Of Engagement Activity | 2023 |
Description | RA volunteer at Big Biology Day science festival 2023 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | Participation by Siru Wang in Big Biology Day 2023 - presenting bespoke hands-on activity based on statistical research taking place at the BSU. |
Year(s) Of Engagement Activity | 2023 |