Testis-specific activation of gene expression

Lead Research Organisation: CARDIFF UNIVERSITY
Department Name: School of Biosciences

Abstract

Each cell in a multicellular animal expresses (transcribes) a subset of all the available genes. Some genes are transcribed in all cells, some in a subset of cell types, while some are transcribed only in a single cell type. Correct gene expression in cells, both turning on and turning off genes in development and differentiation, is essential for normal cellular function, for normal organism development, and for lifelong health. The gene expression repertoire of each individual cell is determined by the cell's identity, and is established during development and cellular differentiation. Sperm are extremely specialised cells, capable of performing several unique biological functions including fertilisation of the egg. In humans, and most other animals, 10-20% of all the genes in the genome are transcribed exclusively in testes, in cells destined to differentiate into sperm, and this expression is critical for the formation and function of sperm, and thus for normal reproduction. Surprisingly, very little is known about the regulatory elements that drive testis-specific gene expression, despite how critically important this is for fertility.

In this project we will use the fruit fly, Drosophila melanogaster, to investigate how transcription of testis-specific genes is activated. We have previously identified a set of proteins that work together as a complex to bind to DNA at the regulatory regions of target genes and activate their expression. Four of the proteins within this complex contain regions that we predict bind directly to DNA. We know that only a very short region of DNA sequence at the start site of testis-specific genes is needed for them to be active. However, the DNA sequences of these short regions bear little similarity to one another across the more than 1000 genes regulated by this complex. How is it that this complex recognises apparently very different stretches of DNA, binds to them, and causes activation of the adjacent gene, while ignoring other DNA sites?

We will answer this question using a set of independent experimental approaches coupled with interlinked and integrative computational analyses.
1) We will determine the DNA sequence binding preference for each of the DNA-binding proteins in our complex. This will allow us to look at where these sequences are found in the whole genome - we expect them to be enriched at the positions of testis-expressed genes.
2) We will identify all the sites in the genome at which each of the complex proteins is found in cells in the testis - we expect them to be enriched at the positions of testis-expressed genes.
3) We will make mutant versions of the proteins that can still contribute to the complex but that can no longer bind DNA. We will then identify all the sites in the genome at which these mutant proteins are found in cells in the testis, and determine what effect this has on gene expression - we expect binding at some sites to be lost, and that the genes associated with these sites will no longer be expressed.
4) The results of these experiments will generate a clearer understanding of the biological rules that determine testis-specific gene expression in response to our protein complex. To determine whether we actually understand these rules we will test the activity of a set of mutant versions of gene control regions that we predict should, or should not, activate gene expression.
5) Finally, newly evolving genes are often expressed exclusively in testes, and their expression requires our regulatory protein complex. How do these genes gain this expression pattern? We will begin to test this by directing our complex to different regions of the genome and evaluating whether targeting it to a particular region is sufficient to activate expression of nearby DNA sequences.
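
The expectation in step 1, that preferred binding sequences are enriched at testis-expressed genes, can be illustrated with a minimal sketch. It is not the project's analysis pipeline: the position weight matrix values, the score threshold and all function names are hypothetical, only the forward strand is scanned, and sequences are assumed to be uppercase A/C/G/T.

```python
import math

# Hypothetical log-odds position weight matrix for a 4 bp site (illustrative values only).
EXAMPLE_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def best_score(seq, pwm):
    """Return the highest PWM score over all forward-strand windows of seq."""
    width = len(pwm)
    scores = [
        sum(column[base] for column, base in zip(pwm, seq[i:i + width]))
        for i in range(len(seq) - width + 1)
    ]
    return max(scores) if scores else float("-inf")


def count_with_site(seqs, pwm, threshold):
    """Count sequences containing at least one window scoring >= threshold."""
    return sum(1 for seq in seqs if best_score(seq, pwm) >= threshold)


def enrichment_p_value(k, n, K, N):
    """Hypergeometric P(X >= k): n testis promoters drawn from N promoters,
    K of which contain a site, with k of the testis promoters containing one."""
    total = math.comb(N, n)
    upper = min(n, K)
    return sum(
        math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1)
    ) / total
```

In use, `count_with_site` would be run on testis-expressed promoters and on all promoters, and the two counts passed to `enrichment_p_value`. A real analysis would also scan the reverse strand and use matrices derived from the binding data rather than the placeholder above.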

Technical Summary

Transcriptional regulation of gene expression is essential for normal cellular function and differentiation. Some genes are expressed ubiquitously, while expression of others is lineage restricted. Sperm formation requires >1500 genes that are expressed exclusively in the male germline. Activation of the majority of these genes in Drosophila testes involves interaction between a protein complex containing four sequence-specific DNA-binding factors and a short region (100-200 bp) of promoter DNA sequence at the target gene transcription start site. Despite the large number of target genes, and the short minimal sequence required, we still do not understand the DNA sequence requirements for testis gene expression.
We will apply the powerful genetic and molecular biology tools available in Drosophila to investigate how a complex of proteins containing several DNA-binding subunits recognises specific promoters with very different underlying sequences, and activates gene expression.
We will determine the in vitro binding preference for each of the DNA-binding proteins in the complex by SELEX-seq. We will use ChIP-exo of tagged proteins to reveal the precise in vivo location of bound sites, and to determine whether there are site-specific variations in complex composition. To identify the requirement for DNA-binding activity of each factor at each genomic site, we will generate mutant versions of the DNA-binding proteins that can still interact with protein binding partners but cannot bind DNA, and determine their location by ChIP-exo. RNA-seq will reveal the effect of these mutations on transcription. Our integrated bioinformatics analysis of these data will generate predictions for how individual testis-specific promoters respond to the complex, and we will test these predictions with a set of reporter constructs. Finally, we will determine whether targeting the complex to ectopic genomic locations is sufficient to drive testis-specific expression.
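
One common way to summarise a binding preference from SELEX-seq reads is to compare k-mer frequencies in a selected library with those in the unselected input library. The sketch below illustrates that idea only, and is not a description of the method used in this project: the function names, the pseudocount and the choice of k are assumptions.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a list of read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_enrichment(selected_reads, input_reads, k, pseudocount=1.0):
    """Return {k-mer: selected frequency / input frequency}.

    A pseudocount is added to every k-mer so that k-mers missing from one
    library do not cause division by zero.
    """
    selected = kmer_counts(selected_reads, k)
    background = kmer_counts(input_reads, k)
    kmers = set(selected) | set(background)
    selected_total = sum(selected.values()) + pseudocount * len(kmers)
    background_total = sum(background.values()) + pseudocount * len(kmers)
    return {
        kmer: ((selected[kmer] + pseudocount) / selected_total)
        / ((background[kmer] + pseudocount) / background_total)
        for kmer in kmers
    }
```

K-mers with a ratio well above 1 are candidates for the preferred site; in practice they would be compared across selection rounds and assembled into a motif model before being used to predict binding at promoters.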

Planned Impact

Beneficiaries
This project is aimed at addressing fundamental biology questions, and furthering knowledge in the field of gene expression regulation. It is therefore not expected to have immediate translational impacts. Thus the major beneficiaries will be in the academic sector, as detailed above.

Staff training.
A major impact of our research will be the career development of the associated staff. The post-doctoral researcher will develop extensive molecular biology and genetics lab skills. They will also gain skills in the manipulation of big data, the design, writing and execution of scripts to query the data, and the presentation of highly complex results in visually compelling formats. Bioinformatics and coding have been identified as skills gaps in the UK bioscience community, and the bioinformatician will continue to develop his expertise by taking on this new project. The technician will gain a variety of molecular biology and genetics skills, and will develop experience working with genome-editing approaches; again, these are in high demand in the sector. The post-doctoral researcher will develop their communication skills - to academic peers via presentation at conferences, and to the general public via outreach events. They will develop line management skills (with mentorship from the PI) by taking on day-to-day supervision of the technician. This will equip the PDRA with a highly desirable skill set, based on the application of cutting-edge techniques, computational analysis and soft skills, that they can apply in further career settings either within or outside academia.

Impact outside the academic sector.
Our research project is one aspect of the fundamental question "what makes one cell different from another". This is basic scientific research, but it underpins our understanding of normal development, health and disease. We will continue our outreach work with both schools and adult audiences, to increase the general public's understanding of this science. Our focus on male fertility and sperm production ensures that the audience finds the topic interesting; everyone is interested in sex although not everyone is comfortable talking about it. Talking to the public about fly sperm breaks down barriers and acts as a way to encourage the audience to think about biology and how life works. Explaining why basic biology is important, and why work in insects can increase our understanding of human health and disease, is a key target.
 
Description: This project aims to identify how DNA sequences activate gene expression specifically in testes. We had previously identified a complex of proteins that activates genes in testes by binding to the DNA at the start of the gene sequence. Four proteins in this complex have the ability to bind DNA, so it was not clear which ones were important at different target genes. We have identified the sequences that each of these DNA-binding proteins prefers to associate with, and are using this information to predict binding at specific sites. We are testing whether gene expression is lost if we remove the sequences we think are important.
Exploitation Route: This research can feed into functional models of transcriptional activation. The findings will also be useful for evolutionary biologists investigating how gene expression patterns are established and maintained.
Sectors: Education