Mining the allohexaploid wheat genome for useful sequence polymorphisms

Lead Research Organisation: University of Liverpool
Department Name: Sch of Biological Sciences


Bread wheat is of fundamental importance to UK, European and world agriculture, with an estimated 2007 world harvest of ~550 m tonnes. In the UK, ~1.8 m hectares are planted with wheat, yielding ~7.2 tonnes per hectare, with a farm-gate value of £2.6 billion. The UK has ideal growth conditions for wheat and has a world-class crop improvement programme. Despite its importance, wheat production world-wide has not kept pace with increased demand, and productivity is threatened by disease, increased fertiliser costs, competition for high quality agricultural land, resource limitations, and adverse environmental conditions that dramatically reduce optimal yields. It has been estimated that in Europe productivity has to be doubled to keep pace with demand and to maintain stable prices. Therefore, by narrowing the gap between maximal and actual yields, and by increasing maximal potential yields, sustainable and adequate production of one of the world's most important crops could be secured. The large increases in wheat yield have been primarily due to genetic improvements brought about by selective breeding of elite lines. The power of breeding can be increased by enabling the incorporation of wider genetic diversity and accelerating the identification of best-performing genotypes. This can be achieved using DNA sequence markers to identify genetic diversity underlying key traits. We aim to use next generation sequencing and a novel computational and comparative genomics strategy to identify sequence differences in the genomes of 5 key varieties that can be used to define different versions of a single gene in different varieties. Finding this type of marker in wheat has been problematic in the past because wheat is a hexaploid, with potentially 3 copies of each gene, and most of the sequence differences in wheat lines are between these three copies of a gene within a variety, rather than between genes in different varieties.
With this information and a set of markers, breeding companies and academic scientists will be able to identify and select specific regions of the genomes of different varieties, and use this information to isolate genes and to select lines carrying that region of DNA from crosses. This capability will fundamentally alter wheat research by enabling the use of more diverse lines in breeding, including wild species that have a wealth of under-exploited traits, including stress tolerance. Finally, this genotyping study will facilitate a far greater level of academic research in a key UK crop. The sequencing and informatics strategies we aim to develop will also establish ways to sequence the complete genome of wheat. Currently, the large size of the genome, its hexaploid composition and its predominantly repetitive content are a major barrier to progress. However, the high throughput and low cost of next generation sequencing provide a solution to the scale of the wheat genome. Our proposed work will enable sequencing to focus on gene-rich regions and increase the potential for assembling gene-rich genome sequences. Furthermore, using a novel bioinformatics strategy that uses the complete genome sequence of a closely-related species as a 'template' for identifying both gene structures such as introns and an approximate order of genes, our work will define new ways of assembling gene sequences and the order of genes in wheat chromosomes. This will lower the barriers for future work aimed at larger-scale genome sequencing and analysis. Finally, this project is closely linked to the UK breeding community through WGIN, to academic laboratories studying wheat in the UK through the Monogram Network, and to the international wheat genomics community through the International Wheat Genome Sequencing Consortium. This will ensure the rapid transfer of information to key stakeholders.

Technical Summary

The 16 Gb hexaploid genome of bread wheat is among the largest and most complex genomes, and its importance as a primary food crop demands projects that will generate useful genome sequence. The genomes of grasses vary greatly in size due to the expansion of repeats, primarily retroelements, while the order of genes is remarkably conserved in large chromosomal segments. This leads to large tracts of repeat, which are heavily methylated, interspersed with small groups of genes which have much lower levels of methylation. These features form the basis of a novel strategy to sequence the gene-rich regions of the wheat genome with three complementary approaches: methyl filtration, high Cot normalisation, and gene enrichment by hybridisation, using next-generation sequencing. Sequence from several key breeding lines will identify sequence polymorphisms in 5', 3' and intron regions of genes, where sequence diversity is much higher than in coding regions. Genome-scale alignments of wheat genome sequence with wheat transcriptome sequence assemblies, produced by 454-FLX sequencing and aligned with gene models established in the genome sequence of Brachypodium distachyon, form the template for sequence analysis. This alignment strategy is straightforward and uses existing computational resources and skills available in the partner labs. Sequence polymorphisms will be identified using existing software developed by Bristol and classified bioinformatically into intra-varietal and inter-varietal sequence polymorphisms. The polymorphisms will be validated and used to screen a core set of UK and, in collaboration, a wider variety of Australian germplasm. The project is linked to the International Wheat Genome Sequencing Consortium. It forms part of the UK Wheat Genetic Improvement Network that links breeders, scientists and funding agencies. The project will be coordinated within the Monogram Network of wheat scientists, ensuring links to all UK wheat researchers.
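The distinction between intra-varietal and inter-varietal polymorphisms can be illustrated with a minimal sketch. This is not the Bristol software mentioned above; it assumes a simplified input in which, for each aligned position, the set of bases seen across the gene copies of each variety is already known. The function name, the variety labels and the example sites are all hypothetical.

```python
from typing import Dict, FrozenSet


def classify_site(alleles_by_variety: Dict[str, FrozenSet[str]]) -> str:
    """Classify one aligned position from the bases seen in each variety.

    Simplifying assumption: each variety is summarised by the set of bases
    observed across its (up to three) copies of the gene at this position.
    """
    observed = list(alleles_by_variety.values())
    if not observed:
        raise ValueError("no varieties supplied")
    # Varieties disagree on the set of bases: a candidate varietal marker.
    if any(s != observed[0] for s in observed[1:]):
        return "inter-varietal"
    # Every variety shows the same mixture of bases: the difference lies
    # between the gene copies within a variety, not between varieties.
    if len(observed[0]) > 1:
        return "intra-varietal"
    return "monomorphic"


# Hypothetical example sites for two hypothetical varieties.
sites = {
    "site_1": {"variety_A": frozenset("A"), "variety_B": frozenset("A")},
    "site_2": {"variety_A": frozenset("AG"), "variety_B": frozenset("AG")},
    "site_3": {"variety_A": frozenset("AG"), "variety_B": frozenset("AT")},
}

for name, calls in sites.items():
    print(name, classify_site(calls))
```

In this toy input, only site_3 would be useful as a marker for distinguishing varieties; site_2 shows the kind of difference between gene copies that makes marker discovery in a hexaploid difficult.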
Description We have sequenced the entire genome of bread wheat. This analysis helps breeders improve wheat and helps researchers identify genes involved in useful traits.
Exploitation Route The SNP data generated are being used by wheat breeders. The scientific community are using the data for gene identification.
Sectors Agriculture, Food and Drink

Description The SNPs generated are used by breeders in breeding programmes to improve wheat varieties.
First Year Of Impact 2011
Sector Agriculture, Food and Drink
Impact Types Societal,Policy & public services

Description ERA CAPS research grant
Amount £423,000 (GBP)
Funding ID BB/N005104/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 01/2016 
End 01/2019
Description Collaboration with Meyer Group at Helmholtz Munich
Organisation Helmholtz Zentrum München
Country Germany 
Sector Academic/University 
PI Contribution We provided the sequence data. Both groups undertook assembly strategies that were published alongside each other.
Collaborator Contribution The Helmholtz group provided annotation and analysis, which were published in the paper.
Impact Joint grants, including the ERA-CAPS BBSRC grant. Co-publication of the wheat methylome. Ongoing collaboration through the wheat genomics community on annotation and analysis.
Start Year 2011
Description Feature article on genomics in the Eastern Daily Press
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact The article covered the research activity at the Earlham Institute and at the Norwich Research Park and how it would affect the general public.
Year(s) Of Engagement Activity 2019
Description School visits 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Schools
Results and Impact Numerous school visits to the CGR
Year(s) Of Engagement Activity 2015,2016