Reading the genome: how do transcription factors achieve target specificity?

Lead Research Organisation: University of Cambridge
Department Name: Physiology Development and Neuroscience


Animal development and the normal physiological responses of the body rely upon the precise control of the activity of genes within the genome, a process controlled by regulatory proteins known as transcription factors. How transcription factors identify the genes they control by binding to specific DNA sequences in the nucleus remains a considerable challenge for biologists. We know that information dictating where and when a gene is switched on is encoded in the DNA sequence; however, we do not yet understand this regulatory code. One group of regulatory proteins known to control multiple aspects of development, the Hox family, includes a set of closely related factors found in all animals, from flies to humans. Our current understanding is that specific Hox proteins are responsible for controlling the structures that are produced along the body axis, such as where arms, legs or ribs develop. Mutations in Hox genes have dramatic consequences for the way animals look: for example, loss of the Ultrabithorax gene in the fly can result in a four-winged rather than a normal two-winged fly, and Hox mutations in humans can result in limb abnormalities such as the development of extra digits.

While it is clear from genetics that individual Hox proteins have very different effects on development, paradoxically all Hox proteins are very similar and appear to recognise virtually identical DNA sequences that are believed to dictate the sets of genes they control. This general problem of specificity is common to most families of transcription factors. Some of the specificity may come from specific interactions with cofactor DNA binding proteins and we will explore this. To understand more about how the important class of Hox proteins is able to control specific sets of genes we use the fruit fly as a model system. The Hox proteins of the fly are closely related to those in humans but the fly genome is 20 times smaller, making analysis much easier. We can test Hox protein interactions with DNA, and the functional consequences of binding, by expressing Hox proteins under controlled conditions in a defined fly cell culture system. Understanding how Hox proteins recognise specific genomic sequences and control genes is important for our basic understanding of how all animals develop and it is also important if we wish to gain insights into how evolution has produced the huge range of body plans we see around us. Our experiments will use a cell culture system to determine where all core Hox proteins bind in the genome and the effects they have on gene expression with and without their key DNA binding cofactors. We will then relate this binding to general features of the DNA in the nucleus, essentially how available a stretch of DNA sequence is for binding, to further understand the rules that determine where Hox proteins bind. This will help us to determine how these very similar proteins give rise to different regulatory outcomes. We will use these data to better understand the rules by which Hox proteins are able to recognise the DNA sequences that control genes, and discover differences for each of the Hox proteins.
Since the Hox genes of mammals are organised and operate in the same way as those in flies, our work will help us understand how Hox genes control development of higher animals, including man. In addition, it has recently become clear that Hox genes are involved in several diseases, including cancers, thus a better understanding of how Hox genes work may, in the future, be useful when studying aspects of human disease. Since most families of gene regulators show similar properties to Hox proteins, our studies will also help address the more general issue of how genes are specifically controlled by sets of similar regulators. By studying the basis for this specificity, we will be able to make progress in understanding the regulatory code of the genome.

Technical Summary

How transcription factors achieve target specificity is a major issue for understanding the control of gene expression and the interpretation of regulatory DNA in the genome. There are two aspects: how specific chromatin regions are made accessible for binding, and how transcription factors identify specific sequences. Most transcription factors exhibit limited sequence specificity and their in vitro defined binding motifs occur far too frequently in the genome to account for their functional specificity. We will use the Hox family of homeodomain transcription factors as an example of the above puzzle since each family member exhibits clear in vivo functional specificity but in vitro all paralogues bind to very similar sequences. Although cooperative binding with "cofactors" offers a potential solution, in the few in vivo contexts studied, the role of cofactors in Hox functional specificity is still unclear. We are faced with a wealth of in vitro binding data that extrapolates poorly to the in vivo situation and we believe that a systematic in vivo approach is required. For this we have established an in vivo Drosophila cell culture model and will use ChIP-Seq to systematically investigate Hox binding in the absence and presence of cofactors. This will identify the sequences that underlie in vivo Hox specificity. It is already clear that chromatin accessibility plays a major role in Hox targeting and we will investigate the sequence basis for this. Further, we will use digital genomic footprinting to establish bound sequences at the nucleotide level. Finally, we will associate the binding data with the identification of functional sites on the basis of gene expression regulation. Overall, these studies will enable us to determine the underlying sequence basis for Hox functional specificity that will serve as a paradigm for the general understanding of transcription factor target specificity, developing our ability to interpret regulatory sequences in the genome.
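The point that short binding motifs occur far too frequently to explain functional specificity can be illustrated with a back-of-envelope calculation. The sketch below is not part of the project's methods. It assumes a random sequence with equal base frequencies, an illustrative genome length of 150 Mb, and a single fixed non-palindromic motif counted on both strands; the function name `expected_motif_hits` and the constant `ASSUMED_GENOME_LENGTH` are hypothetical, introduced only for this example.

```python
def expected_motif_hits(genome_length: int, motif_length: int,
                        both_strands: bool = True) -> float:
    """Expected number of exact matches to one fixed motif.

    Assumes a random sequence with equal A/C/G/T frequencies, which real
    genomes do not have; treat the result as an order-of-magnitude guide.
    """
    # Number of start positions at which the motif could sit.
    positions = genome_length - motif_length + 1
    # Probability that a given position matches every base of the motif.
    per_position = 0.25 ** motif_length
    # Counting both strands doubles the expectation for a
    # non-palindromic motif (assumption stated in the lead-in).
    strands = 2 if both_strands else 1
    return positions * per_position * strands


# Illustrative genome length only (assumption, not a figure from this record).
ASSUMED_GENOME_LENGTH = 150_000_000

for k in (4, 6, 8):
    hits = expected_motif_hits(ASSUMED_GENOME_LENGTH, k)
    print(f"motif length {k}: about {hits:,.0f} expected matches")
```

Under these assumptions even an 8-base motif is expected to match thousands of times by chance, which is why sequence preference alone is unlikely to account for the distinct target sets of individual Hox proteins.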

Planned Impact

Outside of our immediate professional circle and the wider academic community described above, we believe there will be two groups of beneficiaries in the medium to long term:
Small biotech/cancer research: over the past decade, there has been increasing evidence implicating a variety of Hox genes in aspects of cancer biology, particularly in lymphoid, prostate, kidney and skin malignancies. For example, Hox-Pbx anticancer peptides are a focus for drug development in myeloid leukaemia and HXR9 is a Hox antagonist being developed to inhibit Hox expressing cancer cell lines. In both of these examples, drugs targeting general aspects of Hox function are likely to be better tailored if we have a better understanding of how specific Hox proteins identify and regulate target genes. We believe our studies defining some of the rules governing Hox and Hox-Pbx DNA interactions will be valuable baseline data for these types of drug discovery efforts. We recognise that this is likely to be a longer term benefit; however, given that Hox targeting therapeutics are currently being developed, there is the potential for more immediate impact in this arena. We will deliver this potential impact by the usual route of peer-reviewed publication and conference presentation. Any published work will be flagged to our Press office for possible press release.
Supporting the need for a trained workforce, we will deliver training in the BBSRC priority area of "new ways of working" to the researchers and any graduate students associated with this work. This will make a contribution to increasing the UK skill base in modern genomics. In particular, handling genome-scale data requires a considerable degree of IT skills and an understanding of quantitative methods. Training in these areas will be provided in the course of this project and will be general enough to be applicable also in other work areas. This impact will be delivered over the course of the funding period.
More generally, we believe our international reputation in the genomics arena is of wider benefit to UK science. Our previous impacts in this area have included increased international collaborations, including participating in major international efforts such as modENCODE. Maintaining a UK profile in modern bioscience research is important for attracting research to the UK. Our publications and presentations will continue to provide these impact benefits over the course of the grant.
The public: our experience from presenting work at Science Fairs indicates that there is considerable public interest in the "hopeful monsters" that are exemplified by the homeotic transformations exhibited by Hox mutations. There is also much interest in how the sequence of the genome can be decoded as a set of genetic instructions. Our efforts to uncover how Hox genes function, and to deliver insights from this work at University open events, will progress throughout the period of the funding and continue beyond.


Description 1. We have identified that regions of constitutive gene activity act as boundaries to delimit the borders of chromatin domains and thus play a role in organizing the packaging of the genome. 2. We have mapped the binding sites of the full set of Hox proteins in Drosophila and identified that chromatin accessibility plays a key role in the specific target selection by different members of the Hox transcription factor family.
Exploitation Route 1. Our analysis of genome architecture supports a two-state model with a fundamental separation of constitutively-active and developmental-related genomic regions. This contributes to understanding the regulatory genome and impacts on the interpretation of the effects of mutations that lead to genetic diseases. 2. Our analysis of Hox transcription factor binding provides a paradigm for how transcription factors interact with chromatin to achieve specificity. This contributes to understanding the regulatory genome and impacts on the interpretation of the effects of mutations that lead to genetic diseases.
Sectors Healthcare

Description Imaging functional chromatin architecture in Drosophila
Amount £382,402 (GBP)
Funding ID BB/S00758X/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 01/2019 
End 12/2021
Description Poster presentation at EMBL Conference "Transcription and Chromatin", Heidelberg 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presented a poster entitled "Regions of very low H3K27me3 partition the Drosophila genome into topological domains" to participants at the EMBL conference on "Transcription and Chromatin". This is one of the key meetings in the field and provided an important opportunity to present our work to an international audience.
Year(s) Of Engagement Activity 2016
Description Talk at Teachers Conference, Galton Institute, Manchester
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Schools
Results and Impact The Galton Institute runs annual 1-day meetings to update sixth-form Biology teachers on recent advances in genetics. I was invited to give a talk on Hox genes and gene regulation followed by discussion. The interaction with the teachers was very good and the feedback was that the session was very successful.
Year(s) Of Engagement Activity 2017
Description Talk in Developmental Biology Seminar Series, Cambridge 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Talk at Developmental Biology Seminar Series. This is a long standing seminar series in Cambridge bringing together scientists and students with an interest in developmental biology. It is important for spreading awareness of our research, networking, and in this case resulted in a link with another lab studying chromatin accessibility enabling us to share experience with different methods.
Year(s) Of Engagement Activity 2016