Genomics Enhanced Wheat Breeding: Using sequencing technologies for trait dissection, marker assisted selection and genomic selection in wheat.

Lead Research Organisation: RAGT Seeds (United Kingdom)
Department Name: Research and Development Ickleton

Abstract

Bread wheat accounts for a fifth of the world's food, is the main source of protein in developing countries and is second only to rice as a source of calories in those consumers' diets. It is the most widely grown arable crop in the UK, where it is grown on around 1.8 million hectares per year.

UK breeders and farmers have been highly successful in developing and growing wheat varieties with higher yield potential: over the period 1948 to 2006, average yields in the UK increased from ~3 tonnes per hectare to ~8.0 tonnes. Unfortunately, wheat production increases have not kept pace with increased demand. Furthermore, wheat productivity is threatened by disease, competition for high quality agricultural land, resource limitations, and adverse environmental conditions that dramatically reduce optimal yields. It has been estimated that in Europe productivity needs to double to keep pace with demand and to maintain stable prices.

To improve wheat varieties, plant breeders have utilised genetic variants, in particular Single Nucleotide Polymorphisms (SNPs), that are linked to known traits such as disease resistance, adaptation to particular environments, bread making quality characteristics and components of yield. Breeders use molecular markers to track linked SNPs as a proxy for these beneficial traits. This allows many thousands of lines to be screened more quickly, more cheaply, and in some cases more accurately than by growing the lines in a field and assessing them conventionally.

Wheat evolved from two separate, naturally occurring hybridisation events, each creating a genetic bottleneck. Firstly, two wild grasses hybridised to form a relative of pasta wheat. Subsequently, a third wild grass hybridised with this relative to produce bread wheat. As a result, the total genome size is approximately 16,000 Mb, or 35 times the size of the rice genome and 5 times the size of the human genome. The relative lack of diversity in bread wheat has led breeders and researchers to cross wheat varieties with relatives of wheat to increase genetic diversity and specifically to introduce beneficial traits such as disease resistance. Whilst a large number of SNPs have been identified in bread wheat, these have been identified in relatively few varieties and are not always relevant to the particular germplasm that a breeder is working with. In addition, many of the relatives that have been crossed into wheat have not had specific SNPs identified and represent a 'blind spot' for breeders.

This project aims to produce DNA sequence data for wheat genes for 280 wheat varieties, using a technology known as exome capture together with 'next generation sequencing'. Exome capture is a complexity reduction process: it restricts sequencing to the gene-containing portion of the genome, which will enable us to produce sequence data for a large number of lines. The lines will be selected to contain key relatives that have been crossed into wheat. These data will enable us to identify the regions that we have been blind to, and to locate genes of interest more accurately so that we can then breed for them using molecular markers. Ultimately, this will help us to develop varieties that have key traits for farmers, such as virus resistances, that will enable them to use fewer pesticides and to farm wheat more reliably.

Planned Impact

Wheat is the UK's major crop and has the 3rd largest production of any cereal globally. This project has the potential to benefit individuals and organisations worldwide for whom improvement in wheat is important. This ranges from farmers, through millers and bakers, to anyone buying bread in their local supermarket.

The impact of the methods we develop will be seen first in the UK wheat varieties targeted in this project, with delivery of improved RAGT varieties to market in the following few years. This will be closely followed by improvements in the RAGT French and German breeding programmes. Whilst the first beneficiary of this research will be RAGT, over time all wheat breeders will benefit from these improvements, as they are able to cross with any nationally listed variety through the plant breeders' exemption to Plant Variety Rights and hence will gain access to the beneficial combinations of alleles achieved through the research undertaken in this proposal. Ultimately this will lead to genetic improvement in wheat varieties as a whole, which will benefit all farmers who grow wheat and all consumers who eat it. To provide a specific example, it is anticipated that sequence information for the Bdv2 BYDV resistance introgression in RGT Wolverine will be used to fine-map the resistance gene and to enable efficient marker-assisted selection. This will help RAGT to breed more varieties with this gene combined with other beneficial alleles. These varieties will allow farmers to grow wheat with fewer or no insecticide treatments, which will have both an environmental and an economic benefit due to the lower inputs. Consumers benefit both through reduced pesticide residues in food and through the societal and environmental benefits of reduced effects of agro-chemicals on wildlife food-chains.

One component of the project is the establishment of bioinformatics pipelines within RAGT. This will be conducted by the fellow working with the ERA bioinformatics service company. This will provide a future platform for the cereal genotyping team to utilise publicly available wheat sequence datasets and to integrate them with the datasets produced in the current proposal, using the current wheat reference genomes and future genomes assembled at high quality from the 10+ Genome Project. This will impact directly on the RAGT molecular wheat breeding programme and will also provide opportunities for further collaborative research with academics utilising sequencing data.

A further beneficiary will be the research of the academic project partner, Prof. Cristobal Uauy. As part of the project ~20 lines to be nominated by Prof. Uauy will be included in the exome capture and sequencing. The intellectual property of this data will reside with the John Innes Centre (JIC) and can be published freely to provide benefits to the wheat research community as a whole. It is anticipated that the project, and the data generated within it, will lead to further possibilities for research collaborations either bilaterally between RAGT and JIC or as part of wider research collaborations between the plant breeding industry and JIC.

Finally, the fellow will be a beneficiary, as the project provides a unique opportunity to translate methods that have to date been limited to academic research, in which he has considerable expertise, to a practical breeding programme to provide more rapid genetic improvement of varieties. He will benefit from the expertise of Prof. Uauy and his research group in the application of genomic technologies to wheat genetics studies. This will provide him with a combination of both specific research skills and transferable skills that will make him well placed to become a global leader in the field of molecular plant breeding.

Publications

 
Description The award has provided the wheat breeding community with a pipeline to convert sequence data into markers that can be implemented in a breeding programme. The ability to design and test low cost markers will transform how breeding companies carry out routine marker assisted selection of their breeding materials. The pipelines and bioinformatic tools are a valuable resource, which will greatly improve the efficiency of any wheat breeding programme.
Exploitation Route The outcomes have allowed both academics and industry to start exploring the use of short read sequences for haplotype definition in wheat. The data will also allow wheat breeders to identify introgressions and new markers to track traits in their breeding programmes. The project has already led to the development of collaborations with other private companies to develop tools to analyse and visualise such dense data sets.
Sectors Agriculture, Food and Drink, Education

 
Description This research has revealed the need for robust bioinformatics tools that can handle the massive amounts of data generated from sequencing wheat. Having noted this need, we established collaborations with several private companies that work on the computing needs of researchers. One of our collaborators has taken up the task of developing a machine learning tool to detect variants in wheat using the pangenome, whilst the need to visualise these variants has been taken up by an American-based collaborator.
First Year Of Impact 2022
Sector Agriculture, Food and Drink
Impact Types Economic

 
Title Discovery panel 
Description Selected a "discovery" panel of 280 wheat varieties that represent genetic diversity both between and within the UK, France and Germany, for exome capture. This was done in collaboration with breeders across these geographies to ensure applicability to their programmes. As part of this process varieties were selected to ensure that known introgressions conferring traits of interest, such as Thinopyrum intermedium on 7D, Aegilops ventricosa on 7D, and Secale cereale on 1B, are represented on the panel.
Type Of Material Biological samples 
Year Produced 2021 
Provided To Others? No  
Impact A panel of 280 wheat varieties representing genetic diversity both between and within the UK, France and Germany was selected for exome capture.
 
Title Exome capture 
Description DNA was extracted from the panel using a high molecular weight DNA extraction kit. 137 samples have been submitted to Arbor Biosciences for exome capture using the International Wheat Genome Sequencing Consortium (IWGSC) myBaits® Expert Wheat Exome capture panel.
Type Of Material Technology assay or reagent 
Year Produced 2021 
Provided To Others? No  
Impact 137 samples have been submitted to Arbor Biosciences for exome capture.
 
Title Exome capture sequencing 
Description The samples were all sequenced and data delivered for analysis 
Type Of Material Biological samples 
Year Produced 2022 
Provided To Others? No  
Impact All the project samples were exome captured and the raw data were delivered.
 
Title KASP design pipeline 
Description The pipeline streamlines the design of KASP markers in the lab and allows for the development of very specific KASPs. The KASP markers designed using this pipeline have a 75% success rate. This has drastically changed the way we design our KASP markers. The pipeline uses the BLAST tool to identify homoeologous hits for the context sequences of the targeted SNPs. The BLAST results are then placed into categories, from which we select the best SNPs to convert to KASPs.
Type Of Material Technology assay or reagent 
Year Produced 2022 
Provided To Others? No  
Impact This has streamlined the process of designing and validating KASP markers identified from target regions. The improvement in the success of the KASPs means that resources are not wasted on developing KASPs that would not work in the laboratory. Time and money are now being saved through the implementation of the KASP pipeline.
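
The categorisation step described above can be sketched as follows. This is only an illustration under stated assumptions, not the RAGT pipeline itself: it assumes BLAST results in the standard 12-column tabular format (outfmt 6), and the function names, the identity and length thresholds, and the three category labels are all hypothetical.

```python
from collections import defaultdict


def count_hits(blast_lines, min_identity=90.0, min_length=50):
    """Count distinct subject sequences hit by each SNP context sequence.

    Expects BLAST tabular lines (outfmt 6): query id, subject id,
    percent identity, alignment length, then eight further columns.
    The thresholds are illustrative assumptions.
    """
    subjects = defaultdict(set)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        if identity >= min_identity and length >= min_length:
            subjects[query].add(subject)
    return {query: len(hit_set) for query, hit_set in subjects.items()}


def categorise(n_hits):
    """Assign a hypothetical design category from the number of hits."""
    if n_hits < 1:
        return "no hit"
    if n_hits == 1:
        return "single-copy"
    if n_hits <= 3:
        # Up to three copies is what one would expect from the A, B and D
        # subgenomes of hexaploid wheat.
        return "homoeologous copies"
    return "multi-copy"


# Made-up example hits for three SNP context sequences.
tail = "\t0\t0\t1\t101\t500\t600\t1e-50\t187"
example = [
    "snp1\tchr7D\t100.0\t101" + tail,
    "snp2\tchr1A\t100.0\t101" + tail,
    "snp2\tchr1B\t96.0\t101" + tail,
    "snp2\tchr1D\t95.0\t101" + tail,
    "snp3\tchr2A\t100.0\t101" + tail,
    "snp3\tchr2B\t97.0\t101" + tail,
    "snp3\tchr2D\t96.0\t101" + tail,
    "snp3\tchr5A\t93.0\t101" + tail,
    "snp3\tchr5B\t80.0\t101" + tail,  # below the identity threshold
]

categories = {snp: categorise(n) for snp, n in count_hits(example).items()}
for snp, category in sorted(categories.items()):
    print(snp, category)
```

In a sketch like this, SNPs in the "single-copy" category would be the simplest candidates for conversion, while those with homoeologous copies would need primers anchored on subgenome-specific bases.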
 
Title Precise stock maintenance 
Description Precise stock maintenance of all 280 varieties started to provide seed stocks for phenotyping experiments later in the Fellowship. 
Type Of Material Biological samples 
Year Produced 2021 
Provided To Others? No  
Impact Precise stocks of each variety were stored and are being maintained for further use in the study.
 
Title Bioinformatics 
Description The fellow worked closely with the internal bioinformatics service at RAGT to develop and implement sequence data analysis pipelines based on information from our pilot exome capture work. Internally known major gene markers were then utilized to validate the pipeline. 
Type Of Material Data analysis technique 
Year Produced 2021 
Provided To Others? No  
Impact The pipeline has provided us with a way to analyse all the sequence data we will generate within the project.
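
One way to picture validating a pipeline against known major gene markers is a simple concordance check between the genotypes the pipeline calls and the genotypes already known from marker assays. The sketch below is an assumption about how such a check could look, not a description of the RAGT implementation; the function name, the data layout and the variety, marker and genotype values are all hypothetical.

```python
def concordance(pipeline_calls, known_genotypes):
    """Compare pipeline genotype calls with known marker genotypes.

    Both arguments map (variety, marker) to a genotype string.
    Only pairs present in both are compared.
    Returns (number compared, number matching, fraction matching).
    """
    shared = [key for key in known_genotypes if key in pipeline_calls]
    matches = sum(1 for key in shared
                  if pipeline_calls[key] == known_genotypes[key])
    fraction = matches / len(shared) if shared else 0.0
    return len(shared), matches, fraction


# Made-up genotypes: known marker results versus sequence-based calls.
known = {
    ("VarietyA", "marker1"): "A/A",
    ("VarietyA", "marker2"): "G/G",
    ("VarietyB", "marker1"): "T/T",
    ("VarietyB", "marker2"): "G/G",  # no pipeline call, so not compared
}
calls = {
    ("VarietyA", "marker1"): "A/A",
    ("VarietyA", "marker2"): "G/G",
    ("VarietyB", "marker1"): "A/A",  # discordant call
}

compared, matched, fraction = concordance(calls, known)
print(f"{matched} of {compared} calls agree ({fraction:.0%})")
```

Discordant pairs, such as the third call in this made-up data, are the ones worth inspecting by hand, since they may point to a mis-called SNP or a sample mix-up.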
 
Description Cougar 4D introgression 
Organisation University of Nottingham
Country United Kingdom 
Sector Academic/University 
PI Contribution Bioinformatics tools developed in the UKRI project were provided to the University of Nottingham for use in their analysis of sequence data.
Collaborator Contribution Sequence data generated from the Cougar project was provided to RAGT.
Impact The project involves pathologists, geneticists and bioinformaticians.
Start Year 2021
 
Description RAGT-Earlham-IBM Excelerate project 
Organisation Earlham Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution We will provide the bioinformatics pipelines and data needed to validate the protocols that will be developed by IBM. The bioinformatics pipelines will be the basis upon which IBM will build the automation tools for the analysis of biological data.
Collaborator Contribution The project is focused on the automation of routine bioinformatics pipelines that have been developed in the UKRI project. IBM will utilise their computing and coding expertise to develop and refine the bioinformatics pipelines that have been developed by RAGT and Earlham. IBM will prioritise the development of advanced bioinformatics pipelines that reduce false positive SNP calls, in order to reduce the time and money wasted on marker design and testing by breeding companies such as RAGT. Machine Learning and Deep Learning based alternatives for SNP calling will also be explored as candidates for comparison with standard tools.
Impact The collaboration is multi-disciplinary and will involve computer programmers, geneticists, bioinformaticians and breeders.
Start Year 2021
 
Description RAGT-Earlham-IBM Excelerate project 
Organisation IBM
Department IBM UK Ltd
Country United Kingdom 
Sector Private 
PI Contribution We will provide the bioinformatics pipelines and data needed to validate the protocols that will be developed by IBM. The bioinformatics pipelines will be the basis upon which IBM will build the automation tools for the analysis of biological data.
Collaborator Contribution The project is focused on the automation of routine bioinformatics pipelines that have been developed in the UKRI project. IBM will utilise their computing and coding expertise to develop and refine the bioinformatics pipelines that have been developed by RAGT and Earlham. IBM will prioritise the development of advanced bioinformatics pipelines that reduce false positive SNP calls, in order to reduce the time and money wasted on marker design and testing by breeding companies such as RAGT. Machine Learning and Deep Learning based alternatives for SNP calling will also be explored as candidates for comparison with standard tools.
Impact The collaboration is multi-disciplinary and will involve computer programmers, geneticists, bioinformaticians and breeders.
Start Year 2021
 
Description Conference Poster Presentation PAG 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact The conference poster presentation had an international reach and managed to reach other international breeding companies. This presentation allowed me to interact with other bioinformaticians that are working in my field. The presentation sparked questions about how we tackle the needs of working with a mega-genome such as wheat.
Year(s) Of Engagement Activity 2023
 
Description Interview for International news 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I had an interview with Blandine Cailliez, a journalist from AGRO DISTRIBUTION (http://www.agrodistribution.fr/), which belongs to La France Agricole, France's largest agricultural press group. She was interested in our FLF award because the project has the potential to have a major impact on wheat breeding. The interview highlighted the need to use genomics in wheat breeding and to translate the brilliant academic research into breeding tools.
Year(s) Of Engagement Activity 2020,2021
URL http://www.agrodistribution.fr
 
Description Interview for online publication 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact The aim of the interview was to feature RAGT in the "Sponsor Highlight" section of the IWGSC website. This is a section where they highlight their sponsors and their work for the wheat community. The feature is posted on the front page of the website for a couple of months and includes a brief presentation of the sponsor and an interview with the representative.
Year(s) Of Engagement Activity 2021
URL http://www.wheatgenome.org/People/Sponsor-Highlight/RAGT
 
Description Monogram bioinformatics panel 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The panel Q&A was a general discussion about challenges and opportunities for researchers and breeders in genomics and bioinformatics, mostly in cereals.
Year(s) Of Engagement Activity 2021
URL http://www.monogram.ac.uk/MgNW2021.php
 
Description Professional Internship for PhD Students 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact This is a presentation to PhD students who are now obliged to conduct a three-month placement in industry. Previously we have had two excellent JIC students in the lab who have worked on specific research projects. The presentation was therefore intended to make the postgraduates aware of the FLF project and to let them know that they are welcome to undertake a three-month work placement with our company.
Year(s) Of Engagement Activity 2021
 
Description Workshop on "Transforming Wheat Breeding Through Integrated Data Management with Genomic Open-source Breeding informatics initiative (GOBii) and Analysis in Flapjack" 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The workshop focused on GOBii data management and data analysis in Flapjack (marker-assisted backcrossing, pedigree verification and forward breeding), developed in collaboration with the James Hutton Institute, and in Galaxy (Genomic Selection). This led to engagement from both plant breeders and postgraduate students.
Year(s) Of Engagement Activity 2021