Identifying non-coding mutations in early-onset diabetes

Lead Research Organisation: University of Exeter
Department Name: University of Exeter Medical School

Abstract

This project will provide new insights into the role of non-coding mutations in disease and the biology of diabetes. It is also highly likely to dramatically and immediately improve the quality of life for patients with diabetes. Ninety-nine percent of the human genome does not code for protein, yet little is known about the contribution of non-coding variation to human disease. Cost and throughput limitations of DNA sequencing, as well as the relative difficulty of interpreting the functional consequences of variants in non-coding regions, have meant that the search for genetic causes of human disease has focused on the small coding part of the genome. Now, advances in sequencing technology have heralded the arrival of cheap whole genome sequencing, so that all 3 million variants in a patient's genome can be rapidly identified. In parallel, enormous advances in epigenomics have allowed detailed annotation of the non-coding genome, and many thousands of functional regulatory sequences have been identified. These sequences act as switches, turning genes on and off and determining cell type. There are only a few examples of variants in these DNA switches causing disease, but we have recently shown that bringing together whole genome sequencing with epigenome annotation can identify this underappreciated type of mutation. We showed that variants in a short sequence far from a key pancreas development gene are the commonest cause of children being born without a pancreas. We have also recently shown that variation in some of these regulatory sequences increases the risk of late-onset type 2 diabetes. In this project we will use the same approach to identify mutations in regulatory sequences that cause familial early-onset diabetes.

We will sequence the entire genomes of >100 patients with Maturity-Onset Diabetes of the Young (MODY). MODY is an inherited form of diabetes that is typically diagnosed before the age of 25. We will select these patients from the world's largest collection of "unsolved" MODY families (currently 3000 patients). We will use regulatory sequence annotations that we have derived from pancreas cells. These cells are central to diabetes because they secrete insulin. We will then test whether the same regulatory sequences are mutated in multiple MODY families. We will follow up variants by testing whether they track with disease status in the wider family, and by sequencing the putative regulatory elements in the rest of our MODY cohort and in unsolved MODY families from our world-wide network of collaborators. Any variants with very strong genetic evidence for a role in MODY will be tested for a functional effect, both in vitro using cultured human cells and in vivo in zebrafish, which have been demonstrated to be an excellent model for examining pancreatic regulatory elements.

This project will give us a better understanding of the role of non-coding mutations in human disease and allow us to develop approaches to identifying this type of mutation. Our project is important if we are to get the most out of the advent of cheap and widely available whole genome sequencing. Identifying new single gene causes of diabetes is also important because of the potential for immediate benefits to patients. We have shown that patients with the commonest forms of neonatal diabetes and MODY can be well-controlled on tablet treatment rather than insulin injections. Other patients, who have mutations in the glucose-sensing gene glucokinase, can be taken off treatment entirely because they have mildly elevated, stable fasting glucose levels which do not affect health and do not respond to or require treatment. The similarities between MODY and Type 2 diabetes mean that identifying new causes of MODY will provide important new insights into the biology of later-onset, and more common, forms of the disease. This may in the long term lead to the development of novel therapeutics for diabetes.

Technical Summary

We aim to use whole genome sequencing and epigenomic annotation to identify new non-coding causes of monogenic early-onset diabetes. From our cohort of 3000 genetically undiagnosed MODY patients, we will select the 250 most likely to have a monogenic cause of their disease. To do this we will use a validated probability calculator that assigns a likelihood of a patient having MODY by integrating various clinical criteria and biomarker information. We will ensure that coding mutations in all known monogenic diabetes genes are excluded using a targeted sequencing assay of the exons and minimal promoters of these genes. Whole genome sequencing will be performed on at least 100 of the remaining patients. Initially we will test for previously undetectable mutations at known monogenic diabetes genes, as whole genome sequencing provides the opportunity to identify structural, copy number and deep intronic cryptic splice variants. We will test for cis-regulatory mutations by integrating the whole genome sequencing data with epigenomic annotation from two cell types that are key to diabetes: pancreatic islets and pancreatic progenitor cells. We will test whether unrelated families share novel variants in active regulatory elements. We have successfully validated this approach by identifying mutations in an enhancer 25 kb downstream of PTF1A as the commonest cause of isolated pancreas agenesis. Genome-wide analyses will be performed to look for regulatory elements at novel genes, as some genes may only cause MODY-like diabetes if tissue-specific regulatory elements are mutated. To prove causality of any particular variant we will test for co-segregation within families, and we will sequence the putative regulatory element in our remaining MODY cohort and in patients from our world-wide network of collaborators. For any variant with convincing genetic evidence, we will perform functional studies in cultured human cells and in vivo using transgenesis in zebrafish.
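
The test for novel variants shared by unrelated families in the same regulatory element can be illustrated with a minimal sketch. This is not the project's pipeline: the record types, field names, element labels, coordinates and the threshold of two families are all hypothetical, and a real analysis would read variant calls and epigenomic annotation from standard files and use an interval index rather than a nested loop.

```python
from collections import defaultdict
from typing import NamedTuple


class Variant(NamedTuple):
    family: str   # hypothetical family identifier
    chrom: str
    pos: int      # 1-based position
    novel: bool   # assumed flag: True if absent from population databases


class Element(NamedTuple):
    name: str     # hypothetical label for an active regulatory element
    chrom: str
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive


def recurrent_elements(variants, elements, min_families=2):
    """Map each element hit by novel variants in at least
    `min_families` different families to the sorted family IDs."""
    hits = defaultdict(set)
    for v in variants:
        if not v.novel:
            continue  # only novel variants are considered
        for e in elements:
            if v.chrom == e.chrom and e.start <= v.pos <= e.end:
                hits[e.name].add(v.family)
    return {
        name: sorted(families)
        for name, families in hits.items()
        if len(families) >= min_families
    }


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    elements = [
        Element("enh_A", "chr10", 1000, 1400),
        Element("enh_B", "chr2", 500, 900),
    ]
    variants = [
        Variant("fam1", "chr10", 1100, True),
        Variant("fam2", "chr10", 1250, True),
        Variant("fam3", "chr10", 1300, False),  # not novel, ignored
        Variant("fam1", "chr2", 600, True),
        Variant("fam1", "chr2", 650, True),     # same family, counted once
    ]
    print(recurrent_elements(variants, elements))
    # prints {'enh_A': ['fam1', 'fam2']}
```

Counting families rather than variants means that several variants in one family do not inflate the evidence for an element; co-segregation within each family would still need to be tested separately.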

Planned Impact

The findings from this work will have significant impacts in both the short and long term. The major non-academic beneficiaries will be patients with diabetes.

Benefits to patients
The Exeter group has an outstanding record of taking genetic findings into the clinic in both neonatal and maturity-onset diabetes. In this project the immediate impact for patients will come from a genetic diagnosis of their diabetes. This is very likely to lead to a change of treatment for the patient. For example, if a non-coding mutation at HNF1A or HNF4A is identified, a patient currently treated with insulin injections can be treated with tablets instead. In the medium term, a similarly dramatic change of treatment might be possible if a novel cause of diabetes is identified. This project will provide important new insights into the molecular biology of diabetes and pancreas formation and function. In the longer term these new insights are likely to lead to improved treatments for patients with other forms of diabetes, including Type 2 diabetes, which is a huge and growing burden on the NHS.

Increasing informatics capacity and making the most of whole genome sequencing data
There is a shortage of bioinformatics expertise in the UK. This is a particularly pressing problem because, with the advances in sequencing technologies, there are moves towards bringing whole genome sequencing into clinical practice. This is evidenced by the recently announced 100,000 Genomes Project in the UK. Our project will increase the skills base in whole genome sequencing analyses. In particular, our work will attempt to make the most of whole genome sequencing data by developing approaches to, and providing a better understanding of, the 99% of the genome that does not code for protein, which is too often ignored.

Impact on UK science
The University of Exeter, the University of Birmingham and Imperial College London are currently expanding their activities to become world leaders in the areas of genomics, integrative and systems biology. Exeter has invested heavily in STEM subjects. One example of this is the new £27 million Research, Innovation and Knowledge centre that has recently been built at the University of Exeter. This brings together individuals, including the Exeter co-applicants, with expertise in genetics, genomics, epigenetics and diabetes. The research programme that we propose here will represent a big step towards the goal of expanding genomics and systems medicine research at these universities. It will also help forge links between these institutions in these cutting-edge areas of research. Through teaching on undergraduate and MSc courses, this project will give talented students exposure to our research and skills in these areas, and will serve as a platform for recruiting the next generation of biological scientists.


 
Description: Diabetes UK grant on developing Type 1 diabetes genetic risk score in South Asians
Amount: £241,112 (GBP)
Funding ID: 15/0005297
Organisation: Diabetes UK
Sector: Charity/Non Profit
Country: United Kingdom
Start: 01/2017
End: 01/2020
 
Description: Diabetes UK conference
Form Of Engagement Activity: A talk or presentation
Part Of Official Scheme: No
Geographic Reach: International
Primary Audience: Professional Practitioners
Results and Impact: This was a presentation by Kashyap Patel at the Diabetes UK Annual Conference on the identification of RFX6 as a new MODY gene. Kashyap won the prize for the best clinical talk that year. The talk prompted substantial discussion and has led to new ideas about how to identify MODY genes. The paper has now been submitted for publication and a draft is currently on bioRxiv.
Year(s) Of Engagement Activity: 2016
 
Description: Invited talk at the Italian Diabetes Association conference
Form Of Engagement Activity: A talk or presentation
Part Of Official Scheme: No
Geographic Reach: International
Primary Audience: Professional Practitioners
Results and Impact: This was an invited talk on the approach we developed to identify patients with monogenic diabetes using a genetic risk score. The score has been an important part of this study because it helps us separate monogenic from polygenic diabetes, which in turn helps to identify new genes. I also discussed RFX6 as a new MODY gene. The talk generated significant discussion and ideas for future collaborations.
Year(s) Of Engagement Activity: 2016