A systems based approach to integrating genetic and longitudinal omics data to support diagnosis and prediction of common chronic disease

Lead Research Organisation: King's College London
Department Name: Genetics and Molecular Medicine


New technologies are providing opportunities to measure health and disease in many novel ways. The data produced are complex and hard to decipher, even for clinicians and health workers. This proposal will investigate how modern molecular techniques, which measure in blood the activity and expression of genes and the signatures of chemical reactions in the cell (metabolites), can help predict early disease. To do this we need to explore how global gene expression and metabolites alter over time, and how these longitudinal changes, along with other new molecular and genetic techniques (called omics), can be used to explore disease mechanisms and susceptibility in ageing populations. To explore the biology of "omic" variability, and to lay the foundation for the clinical integration of genetic and genomic data, we will investigate the longitudinal relationships of cellular and genomic phenotypes, including global gene expression and metabolites, in 700 twins over 7 years, measured at three time periods. The study subjects derive from the TwinsUK cohort, on whom there is already extensive clinical information and cross-sectional genetic and genomic data. Building on these existing data, and making use of the specific methodological opportunities and advantages afforded by the twin design, we will explore how these genomic traits track and vary over time, determine how such variation relates to underlying genetic variation, and explore the joint contribution of genetic and genomic data to disease risk and onset. We will also explore the potential value of monitoring changes in these and other situations within an integrated personalised medicine framework. We will use and develop new analysis approaches to integrate these complex data sets and suggest which changes might play a role in clinically relevant traits and in disease itself.
This study will provide novel insights into disease understanding and stimulate larger-scale efforts to combine modern genetic and genomic data for clinical benefit in the future. These studies will pave the way for individualised medicine.

Technical Summary

This proposal will investigate the longitudinal relationships of cellular and genomic phenotypes, including global gene expression and metabolites, in 700 twins over 7 years, measured at three time periods. To increase compliance, some of the blood sampling will be performed by post as well as at routine visits. We have used postal blood sampling for the last ten years and have found that it works well for the assays without loss of quality, and that local GPs are happy to cooperate. So far 98% of samples have arrived within 48 hours of blood draw. For the transcriptomics we will use RNA-seq in fasting whole blood, to be performed in Geneva; for the metabolomics we will use the non-targeted mass spectrometry platform of Metabolon Inc in the USA. We have used both methods extensively. The study subjects derive from the TwinsUK cohort (www.twinsUK.ac.uk/phenotypes), on whom there is already extensive clinical information and cross-sectional genetic and genomic data. We will use and develop systems analysis approaches to integrate these complex data sets and make causal inferences. In particular, we will develop novel, robust analytical approaches that combine methods for investigating multidimensional association between sets of high-dimensional data with dimension reduction approaches and characterisation of the internal co-variation structure in each data set. Initial cross-sectional integrative analysis will be extended to the longitudinal setting to investigate changes in omics profiles, and predictive modelling of clinically relevant traits incorporating several types of omics phenotypes will be performed.
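
As a rough illustration of combining dimension reduction with multidimensional association between two high-dimensional data sets, the sketch below reduces each block with principal component analysis and then computes canonical correlations between the reduced blocks. It is not the project's method: the choice of PCA followed by canonical correlation analysis, the function names, the block sizes (other than the 700 samples) and the simulated data are all assumptions made for illustration only.

```python
import numpy as np


def pca_scores(data, n_components):
    """Project centred data onto its leading principal components."""
    centred = data - data.mean(axis=0)
    # SVD of the centred matrix: the columns of u * s are the PC scores.
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def canonical_correlations(x_scores, y_scores):
    """Canonical correlations between two low-dimensional score matrices."""
    # Orthonormalise each centred block with a QR decomposition; the
    # singular values of Qx^T Qy are then the canonical correlations.
    qx, _ = np.linalg.qr(x_scores - x_scores.mean(axis=0))
    qy, _ = np.linalg.qr(y_scores - y_scores.mean(axis=0))
    corr = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(corr, 0.0, 1.0)


def simulate_blocks(n_samples=700, n_genes=200, n_metabolites=80, seed=0):
    """Simulated data only (hypothetical sizes): two blocks sharing one latent factor."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, 1))
    expression = latent @ rng.normal(size=(1, n_genes))
    expression += rng.normal(size=(n_samples, n_genes))
    metabolites = latent @ rng.normal(size=(1, n_metabolites))
    metabolites += rng.normal(size=(n_samples, n_metabolites))
    return expression, metabolites


expression, metabolites = simulate_blocks()
rho = canonical_correlations(pca_scores(expression, 5), pca_scores(metabolites, 5))
print(rho)  # canonical correlations, largest first
```

In this simulation the first canonical correlation reflects the shared latent factor, while the remaining ones reflect only noise. Real omics data would additionally need normalisation, covariate adjustment and handling of twin relatedness, none of which is shown here.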

Planned Impact

The principal beneficiaries of the research will be:
i) Academics as outlined in the "academic beneficiaries" section;
ii) Industry and biotechnology companies, in a position to exploit the improved biological understanding we will provide to develop novel products (see below);
iii) The public sector (NHS, policy-makers), provided the research generates translational advances that provide more cost effective means of managing disease;
iv) The wider public, if those translational advances provide more effective strategies for the prediction, treatment and prevention of chronic diseases.

The academic benefits will be manifest through:
i) The generation of new knowledge related to chronic disease progression, with the potential to contribute to amelioration of the social, economic and personal costs of these conditions;
ii) The development of a unique biosample and data set available to other researchers;
iii) The development and promulgation of novel methods for analysis of complex longitudinal data sets;
iv) The aggregation through collaboration of additional expertise in this area;
v) Improved training of researchers in the specific areas of research activity, and in the development of cross-disciplinary expertise.

The broader economic and social impact will be manifest through:
i) Economic benefits to pharma and biotechnology companies (including "spin-outs" with potential for attracting "inwards" investment) able to exploit actionable translational opportunities with respect to the development of novel prognostic and therapeutic approaches that build on the associations we detect;
ii) Benefits to those developing and marketing omics assays, in terms of defining additional content, and expanding markets (research initially, but potentially for clinical use);
iii) Improved effectiveness of public services if the biological insights result in better ways of predicting, treating and preventing late-onset disease (novel treatments, better diagnostics, improved strategies for stratifying risk and response to interventions);
iv) Maximising value of next-generation sequencing data collected within systems such as the NHS for other purposes, by augmenting the value of such data for late onset disease prediction;
v) Improved health outcomes (less disease-related morbidity and mortality) if the work leads to effective clinical translation, resulting in further personal, social and economic benefits.

It is important to be realistic about the true timelines for effective clinical translation. As we make clear in the grant, a complete understanding, particularly of the relationships between omics biomarkers and disease-related outcomes, requires accrual of those outcomes over time (thereby strengthening the ability to separate out causal and reactive changes in omics phenotypes). We see the research proposed as providing enablement (in terms of methods, data sets, and paradigms) that will support efforts to integrate genetic and genomic data relevant to late-onset diseases. Despite all the talk of a genomics-based health care "revolution", the list of indications is so far (rightly) focused around cancer, rare diseases, infectious diseases and pharmacogenetics, and a measured description of the value of whole genome sequencing (for example) to common, late-onset disease has been lacking. Based on our extensive experience of complex trait genetics, we believe that the value of DNA sequence alone (or any omic alone) will be limited in terms of risk stratification for most diseases. It will require both the integration of these different data types and additional longitudinal clinical phenotypes (which we argue are best derived from omics data) to achieve the levels of overall specificity and sensitivity needed to trigger a specific preventative intervention or invasive diagnostic test. We believe this project will play a crucial role in enabling such a reconfiguration of health care delivery.


Description: TWINS 2017: The Joint 4th World Congress on Twin Pregnancy and 16th Congress of the International Society of Twin Studies (ISTS)
Form Of Engagement Activity: A talk or presentation
Part Of Official Scheme?: No
Geographic Reach: International
Primary Audience: Postgraduate students
Results and Impact: Talk at a conference that aims to further research and public education in all fields related to twins and twin studies, for the mutual benefit of twins and their families and the scientific community.
Year(s) Of Engagement Activity: 2017
URL: https://www.mcascientificevents.eu/wp-content/uploads/2017/05/Scientific-Programme-29.05.pdf

Description: World Precision Medicine Congress 2017
Form Of Engagement Activity: A talk or presentation
Part Of Official Scheme?: No
Geographic Reach: National
Primary Audience: Industry/Business
Results and Impact: Talk at World Precision Medicine Congress 2017, which brings together academia, industry and healthcare to discuss and debate key challenges and opportunities within: Diagnostics, Big Data, Genomics, Infrastructure, Investment, Rare Diseases, Oncology, Neuroscience, Diabetes.
Year(s) Of Engagement Activity: 2017
URL: http://www.globaleventslist.elsevier.com/events/2017/05/world-precision-medicine-congress-2017/