Nucleosome positioning and transcriptional regulation in Drosophila differentiated cells

Lead Research Organisation: Cardiff University
Department Name: School of Biosciences

Abstract

Each cell in a multicellular animal expresses (transcribes) only about half of all its genes. Some genes are transcribed in all cells, some in a subset of cell-types, while some are transcribed only in a single cell type. Correct gene expression in cells, both turning on and turning off genes in development and differentiation, is essential for normal cellular function, for normal organism development, and for lifelong health. The gene expression repertoire of each individual cell is determined by its identity, and is established during development and cellular differentiation. 10-20% of all the genes in the genome are transcribed exclusively in testes, in cells destined to differentiate into sperm; 1-2% of all genes are transcribed exclusively in muscle cells.

Regulation of gene transcription involves several processes working in concert - notably DNA packaging and sequence-specific DNA-binding factors, which then recruit the transcription machinery. The DNA in the cell is packaged with proteins to form chromatin. The fundamental packing unit is the nucleosome, a protein complex around which the DNA strand is wrapped, forming a beads-on-a-string structure. The precise position of each nucleosome is not determined by the DNA sequence directly and varies between cell-types. On average, in most studied cell-types, nucleosomes are found in a canonical pattern positioned around genes, particularly in the region of DNA where transcription of the gene starts (the transcription start site, TSS). This positioning of nucleosomes is thought to be important for allowing access of proteins that bind specific DNA sequences and regulate transcription of the adjacent gene. On average, genes that have higher expression levels have more accurately positioned nucleosomes. However, the studies that have led to this view have used tissue culture cells or mixed cell populations. Differentiated cells have not been examined.

We have extensive preliminary data to indicate that many genes transcribed at extremely high levels in spermatocytes, the cells that give rise to sperm, do not follow these established rules regarding nucleosome positioning. These data cast doubt on the conclusion that high gene expression and canonical nucleosome positioning are necessarily linked. They also raise the question of how spermatocytes achieve a high level of transcription of these genes, and whether other cell-types also deviate from established rules. These genes rely on a specific transcription factor complex, termed testis meiotic arrest complex (TMAC); we have found that nucleosome positioning at TMAC-target TSSs is altered when this complex is mutated.

In this project we have three main aims:
1) To characterise the nucleosome positioning at TSSs in spermatogonia, the precursors of spermatocytes, to reveal how the positioning of nucleosomes changes as genes are turned on and off in this lineage. We will further determine the effect of loss of specific nucleosome positioning factors and transcriptional regulators at these sites. We will map the binding sites of TMAC and other regulators across the genome in these cells.

2) To examine whether the unusual pattern of nucleosome position we see for testis-specific gene expression is also seen in other cell-type specific gene expression programmes, or whether the canonical pattern is used in this situation. In either case we will determine how the pattern changes in differentiation. We will analyse nucleosome position around TSSs in adult muscle cells, and their immediate precursor cells.

3) To understand the molecular details of how high levels of gene expression are driven in the absence of canonical nucleosome positioning at TSSs in spermatocytes, we will characterise the DNA sequences responsible for the testis-specific expression. We will determine how they interact with TMAC and how they influence local nucleosome position in their normal location and when we move them to a new position in the genome.

Technical Summary

Transcriptional regulation of gene expression is essential for normal cellular function and differentiation. Some genes are expressed in all cell types, while expression of others is lineage restricted. Sperm formation requires expression of a large number of genes (>1500) that are expressed exclusively in the male germline. Regulation of gene expression involves interactions between the local DNA sequence, the general transcriptional machinery, sequence-specific DNA-binding factors, and the local chromatin structure (nucleosome positioning and modification). We have extensive preliminary data describing the chromatin conformation at transcription start sites (TSSs) active in normal and transcription factor mutant spermatocytes. TSSs of broadly expressed genes, which are typically active early in spermatocyte differentiation, follow the canonical nucleosome positioning profile of a nucleosome-free region at the TSS flanked by well-positioned nucleosomes. Strikingly, however, the positioning of nucleosomes at testis-specific TSSs does not conform to this pattern. While there is a broad nucleosome-depleted region at the TSS, there are no positioned nucleosomes either up- or downstream. This finding challenges the textbook view that canonical nucleosome positioning and high levels of gene expression are tightly coupled. We will conduct a series of further experiments to fully document this chromatin architecture in the Drosophila male germline, including studies in transcription factor and nucleosome remodelling factor mutant spermatocytes, and studies to determine binding sites of various critical regulators. We will determine whether the unusual nucleosome positioning at testis gene TSSs is unique, or also found in other tissue-specific contexts. An extensive mechanistic study will determine how the sequences at testis-specific TSSs interact with regulatory factors and regulate or impose nucleosome positioning on flanking sequences.
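
For illustration only, the kind of TSS-centred nucleosome profile described above can be sketched in Python. This is not the project's analysis pipeline: the function name `tss_meta_profile`, the input layout (sorted fragment midpoints per chromosome, taken here as rough estimates of nucleosome centres) and the toy coordinates are all assumptions made for the example. In such an averaged profile, a canonical TSS set would be expected to show a dip at offset 0 flanked by regularly spaced peaks, whereas a profile with a dip but no flanking peaks would correspond to the testis-specific pattern reported here.

```python
from bisect import bisect_left, bisect_right


def tss_meta_profile(midpoints, tss_list, flank=1000):
    """Average number of fragment midpoints at each offset from the TSS.

    midpoints: dict mapping chromosome name -> sorted list of midpoint
               coordinates (hypothetical input layout, assumed for this sketch)
    tss_list:  list of (chromosome, position, strand) tuples, strand '+' or '-'
    Returns a list of length 2 * flank + 1. Index i corresponds to offset
    i - flank, and negative offsets are upstream of the TSS.
    """
    profile = [0.0] * (2 * flank + 1)
    n_used = 0
    for chrom, pos, strand in tss_list:
        positions = midpoints.get(chrom)
        if positions is None:
            continue  # no data for this chromosome, so skip the TSS
        n_used += 1
        # Binary search for the midpoints within +/- flank of the TSS.
        lo = bisect_left(positions, pos - flank)
        hi = bisect_right(positions, pos + flank)
        for p in positions[lo:hi]:
            offset = p - pos
            if strand == '-':
                offset = -offset  # orient so that upstream is always negative
            profile[offset + flank] += 1
    if n_used:
        profile = [count / n_used for count in profile]
    return profile


# Toy data: three midpoints on a made-up chromosome, one TSS at position 1000.
toy = {"chrA": [900, 1150, 1320]}
plus = tss_meta_profile(toy, [("chrA", 1000, "+")], flank=200)
minus = tss_meta_profile(toy, [("chrA", 1000, "-")], flank=200)
```

With real data the per-offset averages would normally be smoothed and normalised for sequencing depth before plotting; both steps are omitted here to keep the sketch short.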

Planned Impact

Beneficiaries
This project is aimed at addressing fundamental biology questions, and furthering knowledge in the field of gene expression regulation. It is therefore not expected to have immediate translational impacts. Thus the major beneficiaries will be in the academic sector, as detailed above. The project uses state-of-the-art methodology, and there are likely to be greater impacts in increasing the availability and thus exploitation of these methods. The work will also provide training for two researchers in high-need areas, particularly bioinformatics. The researchers will not only be able to apply bioinformatics tools, but will have been instrumental in developing new tools.

Technology advancement:
NAK developed the specific MNase-seq protocol that we used for the preliminary experiments and that we will use extensively in this project, in collaboration with Konrad Paszkiewicz (Exeter University), and published this in 2011. The method requires modifications to the standard Illumina library preparation method, and NAK has now optimized the use of this in the NeoPrep library preparation robot. We envisage that with further development, dissemination and publication, other Illumina sequencing centres will adopt the modified sample preparation pipeline to deliver this methodology to other groups. Success in this project, coupled with our dissemination activities, will increase uptake of this method by other researchers in the UK.

Staff training.
A major impact of our research will be the training of the two staff. The post-doctoral researcher will develop extensive molecular biology and genetics lab skills. S/he will also gain skills in the manipulation of big data, the design, writing and execution of scripts to query the data, and the presentation of highly complex results in visually compelling formats. Bioinformatics and coding have been identified as skills gaps in the UK bioscience community. The technician will also gain a wide variety of molecular biology and cell biology skills, which are again in high demand in the sector. They will also have the opportunity to learn bioinformatics, especially in the second half of the project. Both the researchers will develop their communication skills - to academic peers via presentation at conferences, and to the general public via outreach events.

Impact outside the academic sector.
Our research project is one aspect of the fundamental question "what makes one cell different from another". This is of broad interest to the public, and we will continue our outreach work with both schools and adult audiences, to increase the general public understanding of this science.
 
Description: The work is not yet published, but we have so far generated several large datasets revealing gene expression patterns in Drosophila testes, and the associated chromatin structures around the expressed genes. This has been coupled with a parallel analysis of gene expression patterns and chromatin structure in Drosophila muscles. This is feeding into a large-scale analysis, and we anticipate that the results will overturn the conventional view of the role of chromatin in gene expression. The work provides experimental evidence in Drosophila that chromatin architecture is clearly linked to the type of gene (tissue-specific vs ubiquitously expressed), rather than gene expression level being a major driver. The work has now been completed and analyses are in the final stages of being prepared for publication.
Exploitation Route: The high-throughput sequencing data will be shared on publication, and has the potential to feed into others' work through re-analysis in different contexts.
Sectors: Other