Hydroxy-Sensitive Cut Counting (HSCC); simultaneous, genome-wide mapping of 5-methylcytosine and 5-hydroxymethylcytosine in mammals

Lead Research Organisation: University of Edinburgh
Department Name: Centre for Molecular Medicine

Abstract

The human genome contains all the instructions (genes) necessary to create a human being, beginning from a single fertilised egg cell. Although the same genome sequence is present in all cells, there are hundreds of markedly different cell types, such as brain (neurons), kidney, and liver cells, in the human body. Each cell type must 'turn on' a different subset of genes for the correct functioning of each tissue type. A primary mechanism by which cells mark genes to be turned on or off is called 'epigenetics'. Epigenetics refers to heritable changes in gene expression that are not caused by mutations in the underlying DNA sequence.

The most comprehensively studied epigenetic mark in mammals is DNA methylation, which involves the attachment of a tag-molecule called a 'methyl group' to cytosines to give 5-methylcytosine (5mC). It typically occurs in a CpG dinucleotide context, although non-CpG methylation occurs in embryonic stem cells. Between 60% and 90% of all CpGs are methylated in mammals. DNA methylation is associated with gene silencing and is essential for normal development. Key processes including genomic imprinting, X-chromosome inactivation, suppression of repetitive elements, stability of gene expression and carcinogenesis either depend on or involve dynamic changes in DNA methylation patterns. From this perspective it is important to know where modified DNA resides in the genome.

Much of what we know about DNA methylation in mammals is based on a set of techniques which can distinguish between 5mC and an unmethylated cytosine (C). These include the use of methyl-sensitive restriction enzymes (which can cut unmethylated, but not methylated DNA) and bisulfite sequencing, a method that allows accurate quantification of methylation levels at several neighbouring cytosines simultaneously.

This analysis has been confounded by the identification of a new type of modified DNA, 5-hydroxymethylcytosine (5hmC), that is present at high levels in mammalian tissues. Indeed, 5hmC is 40% as common as 5mC in mouse brain samples. Despite intense study of DNA methylation for the last 30 years, the presence of 5hmC in mammalian tissues had been missed, especially as most of the techniques used to identify methylation do not differentiate between 5mC and 5hmC.

We propose a radical new technique termed Hydroxy-Sensitive Cut Counting (HSCC) to simultaneously analyse 5mC, 5hmC and C at the 1.5 million and 2.3 million CCGG sequences present throughout the mouse and human genomes, respectively. HSCC is based on the observation that the restriction enzyme MspI can cut its target site, CCGG, if the internal C is 5-hydroxymethylcytosine, but not if it is beta-glucosyl-5-hydroxymethylcytosine (ghmC). Using a well-characterised and commercially available enzyme, T4 phage beta-glucosyltransferase, we will convert all the 5hmC in the genome to ghmC, and digest the sample with MspI before and after this treatment. The sequences surrounding each MspI site will then be sequenced using next-generation sequencing. The number of sequences in the treated versus untreated samples for each MspI site is a measure of the amount of 5-hydroxymethylcytosine present, allowing for semi-quantification of genome-wide 5hmC levels. The resulting 5hmC profiles will represent a tissue 'identifier': a barcode that will be a read-out of a normal tissue state.
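The treated-versus-untreated comparison described above can be illustrated with a minimal sketch. It is not the project's analysis pipeline. It assumes that per-site read counts are already available as dictionaries, that MspI cuts every site in the untreated sample, and that glucosylated 5hmC sites are lost from the treated sample. The function name `estimate_5hmc`, the `scale` factor (a hypothetical depth correction, for example derived from sites believed to lack 5hmC) and the `min_reads` threshold are all assumptions introduced for illustration.

```python
def estimate_5hmc(untreated, treated, scale=1.0, min_reads=10):
    """Estimate the 5hmC fraction at each MspI site from cut counts.

    untreated: dict mapping site ID -> read count (MspI digest only).
    treated:   dict mapping site ID -> read count (glucosylated, then MspI).
    scale:     hypothetical factor that brings treated counts to the same
               sequencing depth as the untreated library.
    min_reads: sites with fewer untreated reads are skipped as too noisy.
    """
    estimates = {}
    for site, u_count in untreated.items():
        if u_count < min_reads:
            continue  # too few reads for a meaningful ratio
        t_count = treated.get(site, 0) * scale
        # Reads lost after glucosylation are attributed to 5hmC at the site.
        fraction = 1.0 - t_count / u_count
        # Sampling noise can push the ratio outside [0, 1]; clip it.
        estimates[site] = min(1.0, max(0.0, fraction))
    return estimates


if __name__ == "__main__":
    untreated_counts = {"chr1:100": 100, "chr1:200": 50, "chr1:300": 5}
    treated_counts = {"chr1:100": 60, "chr1:200": 50}
    for site, frac in estimate_5hmc(untreated_counts, treated_counts).items():
        print(site, round(frac, 2))
```

Sites that keep all of their reads after treatment are reported as 0, so regions depleted of 5hmC are measured rather than simply missing from the output.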

HSCC represents a dramatic improvement on the existing 'genome-wide' techniques that use antibodies to harvest DNA containing 5hmC. Such 'affinity' techniques can only assay regions of the genome which contain 5hmC and often exhibit strong sequence-context biases. In contrast, HSCC will assay MspI sites regardless of 5hmC level. This is vital, as knowing where 5hmC is depleted is as important as knowing where it is enriched if we are to understand the role of this exciting new mark in mammalian biology.

Technical Summary

DNA methylation, the covalent attachment of a methyl group directly to the fifth position of cytosine (C) in DNA, is the most widely studied epigenetic mark in mammals. DNA methylation is involved in maintaining genome integrity by silencing transposable elements, the process of X-inactivation in females, regulation of allele-specific expression at imprinted loci, and may also contribute to defining tissue-specific gene-expression patterns.

The recent discovery of substantial amounts of 5-hydroxymethylcytosine (5hmC), a derivative of 5-methylcytosine (5mC), in various mouse tissues and human ES cells has necessitated a re-evaluation of our knowledge of 5mC/5hmC patterns and functions in mammalian cells. Indeed, we recently showed that many of the techniques used to assay DNA methylation patterns, including bisulfite sequencing, fail to differentiate between these two modifications. Our current understanding of 5hmC is now seriously limited by the inability to assess genome-wide patterns of 5hmC in an unbiased and quantitative manner.

Here we describe a radical new assay, Hydroxy-Sensitive Cut Counting (HSCC), which will allow determination of 5mC and 5hmC levels at over 1.5 million MspI sites throughout the mouse genome. HSCC will exploit the newly described differential sensitivity of the MspI restriction enzyme to its target sequence, CCGG, when the internal cytosine residue is either 5hmC (C5hmCGG; cut) or beta-glucosyl-5-hydroxymethylcytosine (Cg5hmCGG; uncut). Subsequently, EcoP15I-digested DNA will be subjected to linker-mediated PCR (LM-PCR) followed by massively parallel sequencing and in silico analyses to determine genome-wide, locus-specific 5mC, 5hmC and unmodified C levels in tissues. Whereas affinity-based techniques can only reliably sample regions with moderate to high levels of 5hmC and cannot report on regions lacking the mark, HSCC will sample all MspI sites irrespective of 5hmC levels.
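One part of the in silico analysis mentioned above would be a reference map of MspI sites and their flanking sequence tags, against which sequenced reads can be counted. The sketch below is an illustration only, not the project's software. It scans a sequence for CCGG, applies the MspI cut position (C^CGG) and extracts a fixed-length tag on each side of every site. The function names and the default `tag_len` of 25 bases are assumptions made for the example.

```python
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def ccgg_sites(seq):
    """Return the 0-based start position of every CCGG in the sequence."""
    return [m.start() for m in re.finditer("CCGG", seq.upper())]


def flanking_tags(seq, tag_len=25):
    """Map each CCGG position to its (upstream, downstream) tags.

    MspI cuts C^CGG, so both tags are read 5'->3' away from the cut and
    begin with CGG. tag_len is a hypothetical tag length; tags near the
    ends of the sequence may be shorter.
    """
    seq = seq.upper()
    tags = {}
    for pos in ccgg_sites(seq):
        # Top strand, reading away from the cut towards higher coordinates.
        downstream = seq[pos + 1 : pos + 1 + tag_len]
        # Bottom strand, reading away from the cut towards lower coordinates.
        upstream = reverse_complement(seq[max(0, pos + 3 - tag_len) : pos + 3])
        tags[pos] = (upstream, downstream)
    return tags


if __name__ == "__main__":
    example = "AAAACCGGTTTT"
    print(ccgg_sites(example))
    print(flanking_tags(example, tag_len=6))
```

Because every CCGG in the reference is listed, sites that yield no reads still appear in the map with a count of zero, which is what allows all MspI sites to be sampled irrespective of 5hmC level.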

Planned Impact

The role of the newly identified base, 5-hydroxymethylcytosine (5hmC), is a high-impact research topic in the field of epigenetics, including potential epidemiology studies. This application is predicated on developing a new technique, Hydroxy-Sensitive Cut Counting (HSCC), which enables simultaneous, genome-wide mapping of 5-methylcytosine (5mC) and 5hmC in mammals. HSCC is a new research tool for epigenetics research, which has the potential to provide significant impact 'to facilitate new biological understanding' in many areas, including transcriptional regulation, stem cell biology, developmental biology, nutritional research and the epidemiology of cognitive ageing.

The development of HSCC represents potential value for money as a tool that can be used to study the DNA modification patterns in population cohorts, which may reflect the transcription profile of the target tissue. These resulting profiles can act as a read-out of past environmental exposures or disease states that result in altered patterns of transcription. The knowledge gained has the potential to improve prosperity and quality of life.

HSCC can address a central question: how dynamic are the epigenetic patterns of DNA modification in the nucleus? The impact of the technique itself would be to contribute to the knowledge base of the areas outlined above. HSCC has the potential to be used by many epigenetic scientists in their future research. This tool is needed, as inexpensive, high-resolution genomic mapping data that incorporate C, 5mC and 5hmC profiles are not available at present. The availability of HSCC as a new research tool has the capacity to be part of a turning point in DNA modification research that will lead to new observations and have significant impact on scientific advances in the area of epigenetics. Beyond the scope of this application are concerns about the epigenetic effects of nutrition, drugs or stress on foetal programming affecting adult health. HSCC represents a 'cutting edge technology' that has the potential to facilitate rapid accumulation of epigenetic profiling in these areas of research.

In the case of our application, knowledge transfer would take place by publishing and presenting our work at local and international meetings. HSCC represents 'timeliness and promise' as there is a growing effort to characterise DNA modification profiles in areas of cell, developmental and population biology.

Publications

 
Exploitation Route Ongoing
Sectors Healthcare

 
Description Our work has generated much interest in the form of seminar and meeting invitations (see below) and reports in the scientific media, in particular our 2015 paper reporting the epigenetic changes that occur upon adaptation of primary cells to tissue culture (Nestor et al., Genome Biology, 2015, 16). The image we produced was replicated in a number of science news outlets including Genetic and Biotechnology news. See: https://biomedcentral.altmetric.com/details/3311769/news and https://biomedcentral.altmetric.com/details/3311769/twitter
First Year Of Impact 2015
Sector Healthcare
Impact Types Policy & public services

 
Description The Role of Epigenetics in Reproductive Toxicity
Geographic Reach Multiple continents/international 
Policy Influence Type Participation in a guidance/advisory committee
Impact AIM The workshop has the following four objectives: 1. Define epigenetics and understand its potential value for reproductive toxicology. 2. Understand the relationship between epigenetic change and adverse end points. 3. Develop a roadmap for the practical use of epigenetic studies in regulatory applications. 4. Generate a prioritised research agenda. A Workshop Report will be developed, and it is anticipated that the workshop findings will be published in an open access, peer-reviewed journal.
URL http://www.ecetoc.org/wp-content/uploads/2016/07/ECETOC-WR-30.-The-Role-of-Epigenetics-in-Reproducti...
 
Title Rapid reprogramming of epigenetic and transcriptional profiles in mammalian culture systems 
Description In mouse embryonic fibroblasts: methylation profiling by high-throughput sequencing, methylation profiling by array and transcription profiling by array
Type Of Material Database/Collection of data 
Year Produced 2015 
Provided To Others? Yes  
Impact Our 2015 paper reporting the epigenetic changes that occur upon adaptation of primary cells to tissue culture (Nestor et al., Genome Biology, 2015, 16) generated much interest in the form of seminar and meeting invitations and reports in the scientific media. The image we produced was replicated in a number of science news outlets including Genetic and Biotechnology news.
URL https://www.ebi.ac.uk/arrayexpress/search.html?query=+E-MTAB-3172%2C+E-MTAB-3176%2C+E-MTAB-3177%2C+E...
 
Description "DNA methylation and demethylation" Jacques Monod Conférence
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? Yes
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Formed new collaborations and was invited to contribute reviews on my research area.
Year(s) Of Engagement Activity 2013
URL http://www.cnrs.fr/insb/cjm/cjmbilan_e.html
 
Description 20th International Symposium on Microsomes and Drug Oxidations
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? Yes
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The symposium is a unique opportunity to meet top ranking professionals from basic science, drug development and clinical pharmacology, and for interdisciplinary communication between academia and industry. An exciting program was presented, featuring keynote and plenary lectures, symposia and poster sessions.

My plenary lecture (Epigenetic mechanisms in development and disease) was widely discussed and disseminated.
Year(s) Of Engagement Activity 2014
URL http://www.mdo2014.de/Organizing-Committees.416.0.html
 
Description 50th Congress of the European Societies of Toxicology 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? Yes
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The Conference aim was to "Advance Science for Human and Environmental Health". There was a mix of plenary and keynote lectures, symposia, workshops and a full continuing education programme.

Further invitations to participate in future meetings.
Year(s) Of Engagement Activity 2014
URL http://www.eurotox2014.com/
 
Description Bio-IT World and Cambridge Healthtech Institute's Inaugural Clinical Epigenetics workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Recently, and fueled in part by the application of high-throughput sequencing technologies, our understanding of the disease epigenome has expanded, with the inclusion of numerous mutated genes involved in epigenetic regulation, and epimutations directly involved in producing aberrant gene expression. Epigenetic dysregulation is now a hallmark of several complex pathologies, including cancer, metabolic disorders, cardiovascular and neurological diseases, where disease-specific epigenetic signatures are now being utilized clinically for prognostics, diagnostics as well as disease-specific targeted therapy. This event is designed to unite clinicians and researchers using high-throughput technologies to explore regulatory layers above the genome, ultimately for clinical utility.

Old links renewed, new collaborations formed.
Year(s) Of Engagement Activity 2013
URL http://www.clinicalgenomicsinformatics.com/Clinical-Epigenetics/
 
Description Symposium on Assessing Adverse Epigenetic Effects of Drugs and Chemicals 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact New technological advances have led to recent and rapid progress in the characterization and mapping of epigenetic changes. However, there are significant knowledge gaps in understanding the relevance of such changes for toxicological safety assessment that requires further investigation with respect to the science, including potential model systems, endpoints and the most relevant techniques that could be employed.

Epigenetics, which represents mechanisms that control gene expression in a potentially heritable way without modification of the DNA sequence, is a rapidly developing area of research with relevance to stem cell differentiation, developmental processes, cancer and other diseases. Several mechanisms, including DNA methylation, histone modifications, and regulatory noncoding RNAs, appear to influence gene expression in an epigenetic manner. Epigenetic alterations can be induced by exposure to environmental chemicals or as side effects of drugs and thus have implications for human health. In addition, drugs that target specific epigenetic mechanisms are being developed to treat human disease, including several that have been approved for use.

A manuscript arose out of the meeting and new collaborations were pursued.
Year(s) Of Engagement Activity 2013
URL http://www.hesiglobal.org/i4a/pages/index.cfm?pageID=3639
 
Description The Bioprocessing Summit Boston 2015 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presented our work on epigenetics and cell identity to a mainly industrial audience:

About the Event

The Bioprocessing Summit brings together international leaders to discuss today's bioprocess issues from cell line selection to bioproduction. The Summit provides practical details in a relaxed, congenial atmosphere that promotes information exchange and networking. This leading bioprocess meeting is hosted in Boston each summer along the lively and cosmopolitan harbor waterfront. Each year, the international bioprocessing community comes together at the Summit to share practical solutions for today's bioprocess challenges with researchers from around the world. Spanning five days, the 2016 meeting includes 16 conference programs, 9 training seminars, and 10 short courses.
Year(s) Of Engagement Activity 2015
URL http://www.bioprocessingsummit.com/