The Mexican Biobank Project: Building Capacity for Big Data Science in Medical Genomics in Admixed Populations

Lead Research Organisation: University of Oxford
Department Name: Wellcome Trust Centre for Human Genetics

Abstract

Technological innovation is a major driving force of a nation's economic growth. Unfortunately, such rapid advances frequently exceed society's capability of assimilation and translation into applicable benefits. Genomic sciences, propelled by the explosive growth of DNA-based technologies and the advent of the Big Data revolution, are at the forefront of this innovation. However, in order for these advances to benefit emerging economies, their academic communities must develop the expertise for big genomic data generation and analysis - reducing the scientific gap that exists between developed and developing countries.

Such academic development can be fostered via international collaborative research. We propose to undertake the genetic profiling of the most comprehensive DNA biobank of the Mexican population available to date. By doing so, we will build Mexican capacity for embarking on and leading large-scale genomic projects, including local data generation, training of human resources, attraction of top-level scientists in cutting-edge science, and translation into health care improvements. Genetic profiling and deep phenotyping are powerful tools that help us better understand individual variation associated with disease and tackle population-specific health problems. Nation-wide initiatives such as the UK Biobank and the Faroe Genome Project seek to enable a future in which healthcare is guided by the genetic makeup of the population. However, most studies and methods aimed at elucidating the relationships between health and genetic variation are being undertaken in predominantly European cohorts and thus may not be readily applicable to an admixed population such as that of Mexico.

In order to establish the most inclusive genomic repository of a Latin American population, we propose to screen the largest nationwide Mexican Biobank, comprising 40,000 DNA samples collected as part of the Encuesta Nacional de Salud (ENSA) in 2000 [1]. Each of these samples has associated clinical, economic, sociological and epidemiological data that will enable the undertaking of a range of genetic studies including genome wide association studies (GWAS).

As a proof of concept, we will first focus on replicating the identification of variants known to be associated with immune and metabolic syndromes present in the surveyed cohort. Specifically, we will measure antibody responses to a range of pathogens, many of which are widely prevalent in the Mexican population and which have recognised associations with variants in the human leukocyte antigen locus of the human genome. We will also re-test genetic associations with cardiometabolic traits that are available in the ENSA Biobank (e.g. blood glucose and body-mass index).

By undertaking genetic association studies using multiple phenotypes measured in most individuals participating in ENSA 2000, we will be able to recapitulate the approach used by other large-scale initiatives such as UK Biobank. Our study will be particularly well placed not only to replicate the signals discovered in Caucasian populations but also to harness the uniquely admixed structure of the Mexican population to characterize these associations in fine detail.

This project will help solidify Mexico's budding genomic sovereignty through the development of endogenous research capacity. Our final aim is to build a foundation for future genomic research incorporating ethical acquisition, storage, genotyping, sequencing and analysis, all self-contained within developing countries such as Mexico. We foresee that these practices will create jobs for highly specialized professionals in Mexico and, in the long term, improve healthcare for the general population. Our aspiration is to establish a Roadmap that will further the autonomy of other developing economies with similarly admixed populations and ensure that such technological advances become truly global.

Planned Impact

This study represents a major opportunity to build the first genetically diverse biobank in a Latino admixed population.

Given its scope, the primary routes of impact for this study are:

1) Increasing Mexican genomic governance through the generation of Mexican human capital that is able to generate and analyse biomedical population data in Mexico.
2) Generation of candidates for future biomedical investigation related to immunogenic sensitivity to commonly occurring pathogens.
3) Academic outreach activities including:
a) The publication of scientific research articles in peer-reviewed journals describing the biomedical and population knowledge arising from this study.
b) Dissemination of results in international conferences emphasising the bilateral contributions of the institutions involved.
c) Carrying out a workshop to train other Latin America-based academics in the statistical analysis techniques developed during the course of this project.
4) Community outreach activities including:
a) The publication of articles describing our findings in popular magazines accessible to the general population (e.g. Muy Interesante in Mexico and New Scientist in the UK).
b) Generating a web portal describing the findings and their impact to the general population as well as developing a user-friendly interface for the general public to generate quick surveys of the data, with a strong emphasis on highlighting the benefits of the bilateral collaboration between Mexico and the UK.
c) Organising a workshop aimed at engaging public policy makers and health practitioners to help them understand the long-term relevance and potential future implications of our findings for both healthcare and public policy in admixed populations. Particular emphasis will be given to the importance of characterizing the population background before concluding that discoveries made in less admixed populations are readily applicable to the Mexican population.

The great potential of revealing several biomedically relevant associations via this study may, in the long term, foster the emergence of programs aimed at developing population specific treatments and diagnostics in admixed populations. This may in turn boost investment in population based industry initiatives in genomics in Mexico and/or Latin America.

Publications

Description: Mexican Biobank collaboration
Organisation: Centre for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV)
Country: Mexico
Sector: Academic/University
PI Contribution: Through the generous Newton Fund Award we have made excellent progress in our planned work, completing the genotyping of a large cohort of Mexican individuals from the ENSA2000 Biobank collection. This is the largest country-wide collection of biological samples and phenotype information available for Mexico. Our genotyped samples have been selected to be representative of the complex population structure of the Mexican population. The biobank is being established. Genotyping is in progress and data analysis has begun. Work on serological analysis of Mexican samples for evidence of various infections is planned.
Collaborator Contribution: Dr Andres Moreno at LANGEBIO Cinvestav, Mexico, has established a new Mexican Biobank project with MRC and other support, and the Wellcome Human Genetics centre is a major collaborator through my lab.
Impact: A great deal of work is ongoing, with exchange visits, but there are no published papers as yet.
Start Year: 2017