Design optimisation and validation of high density microarrays for multiple Escherichia coli genomes

Lead Research Organisation: University of Birmingham
Department Name: Sch of Biosciences


Since the first complete DNA (genome) sequence for a free-living organism was determined for the bacterium Haemophilus influenzae in 1995, a wealth of additional genomes has been analysed. They include about 40 from different Escherichia coli strains, creating the widest-ranging genomic data set available for any species. These data now enable dissection, at the level of expression and its control, of every one of the 5000 or so genes found in E. coli. Genome sequences are available for strains ranging from important foodborne pathogens to vehicles for expression of medically important proteins for treatment of disease. New methods have been developed for exploitation and analysis of sequence data, with the generic name 'functional genomics'. This term covers analyses based on 'massively parallel' experimental designs, where expression of each individual gene in the genome is analysed simultaneously using an array of specific detection probes, one for each gene, set out on a solid surface. Until recently, this approach was available for studies of transcription (formation of messenger RNA prior to production of the protein encoded by that gene) but had not been developed further. Now, colleagues in this School have worked out ways to use probe arrays to detect the locations of specific proteins called transcription factors. These factors bind to the chromosome to regulate the transcription of specific genes by facilitating or obstructing the enzymes involved. This new technology depends critically on technical advances in array fabrication, and our collaborators at Oxford Gene Technology (OGT) are leaders in this field. Arrays used previously were made by 'spotting' a solution of DNA, in the form of a short stretch of known sequence, onto a glass slide, with one spot for each gene to act as a probe for the complementary sequence present in mRNA.
Now, methods have been developed by OGT whereby these probes are synthesised on the slide in situ, using inkjet printer technology to deliver tiny spots of chemical reagent to each 'feature', or area of the surface that will contain a given probe, to build up the unique order of nucleotides forming that probe. The new method allows for dramatically greater numbers of probes and reduced costs per probe compared with the spotting of pre-synthesised solutions. This means there are enough probes to cover the whole genome, and locations of bound transcription factors can be identified clearly by using the array to detect fragments of chromosome bound to the protein when it is recovered and purified from bacteria. To exploit the availability of numerous new E. coli genome sequences and the technical ability to analyse their bound transcription factors at high resolution, we propose to design and validate, in collaboration with OGT, a new generation of high-resolution microarrays. These will be optimised for both gene transcription studies and transcription factor binding (ChIP-on-chip) studies. The latter method is named after its first step, which purifies transcription factor protein bound to DNA: the protein-DNA complex is called chromatin (abbreviated Ch); IP represents ImmunoPrecipitation, a method that uses a specific antibody to recover just the one protein of interest from whole cell homogenates. 'Chip' represents the microarray (sometimes referred to as a 'chip' by analogy with IT terminology) used to analyse the DNA fragments bound to the immunoprecipitated protein. The work we propose will make this extremely powerful technical approach widely available to the research community and will underpin important advances in understanding of the ways E. coli controls expression of its genes, whether acting as a pathogen or being exploited in industrial processes.
An exciting spin-off from the work will be unique and informative data comparing the patterns of gene expression and RNA polymerase binding throughout the genomes of eight different strains of E. coli.

Technical Summary

The proposal builds on our successful 'Exploiting Genomics' (ExGen) programme to ensure that next generation microarray technology and expertise remain available and accessible to the research and applications communities. Spotted oligonucleotide microarrays have been a key resource that has enabled major progress in functional genomics of Escherichia coli in our ExGen programme, but it is now clear that they have peaked in their development and applications. Technical advances and falling costs in in situ fabrication of microarrays provide flexibility and resolution that cannot be matched by oligo printing. We propose to build on work pioneered by the Busby group and ExGen team at the University of Birmingham, in collaboration with Oxford Gene Technology (OGT), to develop new, versatile and broadly applicable next-generation arrays for both expression and ChIP-on-chip studies. The latter is a powerful method for determining transcription factor binding to the genome, and it complements transcriptomics data. Escherichia coli, the most thoroughly studied model bacterium, is a particularly apposite organism for exploitation of this new technology. It is by far the front runner in comparative genomics, with about 40 whole genome sequences revealing an extraordinary and unexpected genomic diversity and genetic plasticity. It also remains the organism of choice for the bulk of recombinant protein production in the biotech and biopharmaceuticals industries, where opportunities abound for application of this technology to optimise performance and productivity. We thus propose to enable 'rapid exploitation of the very latest cutting edge technology' (scope of the initiative) by developing next generation microarray methodology to support the academic, public sector and industrial communities in the UK. In so doing we will also generate an exciting and novel data set in comparative genomics of gene expression and transcription factor binding.

