Molecular marker-assisted plant breeding on a genome wide scale

Lead Research Organisation: Aberystwyth University
Department Name: IBERS

Abstract

Maintaining or increasing agricultural food production and security is a priority in order to meet the needs of a growing population. This challenge is put into further focus by climate change and the need to reduce the environmental footprint of agriculture. There is thus an urgent need to increase the speed of improvement of crop varieties in terms of yield and increased efficiency of use of resources, such as fertiliser and water. Genetic improvement of these traits in crop plants has been achieved by plant breeding on the basis of selection and crossing of phenotypically superior plants. In the last 20 years or so molecular markers have been used in some breeding programmes, but largely on an ad hoc basis for improvement of a few target traits. The advent of more affordable high throughput (next generation) sequencing and genotyping in the last five years has made it possible to make use of molecular markers in a more comprehensive way than hitherto. This approach, genomic selection (GS), represents a novel way to improve the phenotype of complex agronomic and biological traits governed by many genes, each with a small effect. GS is already beginning to transform the breeding of livestock such as cattle and pigs, but has yet to make an impact at a practical level for crop plants. GS is selection based on the collective composition of molecular markers densely covering the entire genome. The proposed collaboration between the Institute of Biological, Environmental and Rural Sciences (IBERS) and the Computer Science Department at Aberystwyth University gives us an opportunity to test GS empirically and theoretically. IBERS is the only university department in the UK with plant breeding programmes, and we will use this unique position by exploiting our perennial ryegrass breeding programme.
It is based on repeated cycles of recurrent selection and crossing and is well suited for GS, as we have comprehensive phenotypic data for the current generation and earlier generations of this successful scheme. We will use the current generation of motherplants as a "training population" by genotyping it with over 3000 molecular markers covering the entire genome. The aim is that at least one molecular marker is close to a genomic region influencing the phenotype of interest (quantitative trait locus or QTL). The phenotypic data already available from the breeding programme will be combined with the genotype data to generate complex prediction models using established statistical methods, but also state-of-the-art machine learning techniques developed at the Computer Science Department, for the calculation of a genomic estimated breeding value (GEBV), and to test the performance of the models in the breeding programme. The computational models are then used to calculate the GEBV in a validation population, which is different from the training population, using only genotypic data. The resulting GEBV will be used to select individuals for progeny production based on genotype only. Given a dense coverage of the genome, the combined effect of many QTL for the same trait can be improved measurably by incorporating the effect of all alleles simultaneously. This approach will be particularly advantageous in perennial crops, such as ryegrass and other forages, as the need for lengthy plot trials can be reduced. However, this is not the only benefit of GS. The genomic and statistical resources and models developed here will provide us with a platform for discovery of genes and facilitate the unravelling of the architecture of complex traits of agronomic and biological importance.
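
As a rough illustration of the selection step described above, the sketch below computes a GEBV as the sum of marker dosages weighted by estimated marker effects, then ranks candidates on genotype alone. It is not the project's model: the function names, the three candidate plants and the four marker effects are all hypothetical, and a real panel would carry thousands of markers with effects estimated from the training population.

```python
# Minimal sketch: GEBV as a weighted sum of marker dosages.
# All names and numbers here are hypothetical, chosen for illustration only.

def gebv(genotype, marker_effects, intercept=0.0):
    """Sum of allele dosages (0, 1 or 2 copies) times estimated marker effects."""
    if len(genotype) != len(marker_effects):
        raise ValueError("genotype and marker_effects must have the same length")
    return intercept + sum(x * b for x, b in zip(genotype, marker_effects))


def select_top(candidates, marker_effects, n_selected):
    """Rank candidates by GEBV and return the best n_selected names plus all scores."""
    scored = {name: gebv(g, marker_effects) for name, g in candidates.items()}
    ranked = sorted(scored, key=scored.get, reverse=True)
    return ranked[:n_selected], scored


# Assumed marker effects, as if estimated in a training population.
marker_effects = [0.5, -0.25, 0.125, 0.0]

# Assumed validation candidates with genotypes only (no phenotypes).
candidates = {
    "plant_A": [2, 0, 1, 1],
    "plant_B": [0, 2, 0, 2],
    "plant_C": [1, 1, 2, 0],
}

selected, scores = select_top(candidates, marker_effects, n_selected=2)
print(selected)  # candidates chosen as parents on genotype alone
print(scores)
```

Because every marker contributes to the sum, many small-effect loci are accounted for simultaneously rather than one QTL at a time.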

Technical Summary

The advent of ultra-high throughput DNA sequencing and genotyping generates opportunities to enhance the rate of genetic gain in breeding programmes by combining phenotypic selection with faster molecular breeding approaches. Marker assisted selection has been used largely as an add-on to existing phenotype selection. The objective of this project is to demonstrate that genomic selection (GS) in the IBERS perennial ryegrass breeding programme can reduce the breeding cycle from the current 4 years to 2 years. In GS, many markers are scored across the entire genome in a "training population". This population has also been phenotyped, so that models can be developed in which all the markers jointly explain the genetic variance and thus predict the breeding value. We will use the 170 and 54 motherplants of the current generation of the two ryegrass breeding populations as training sets, as they have been comprehensively phenotyped for a range of traits. We are developing a genotyping platform by next generation sequencing, from which we will use a panel of 3072 SNP markers. The genotypic and phenotypic data will be used to generate models for predicting the genomic estimated breeding value (GEBV) with Bayesian regression methods. We will also use state-of-the-art Knowledge Discovery in Databases and machine learning techniques to find predictive relationships between phenotype and genotype more efficiently. The accuracy of the prediction models will be determined by the correlation between GEBV and the true breeding value in half-sib progeny from the motherplants, from which the next generation is selected, as well as in historical data and existing varieties. The sequence and genotype resources, and the science underpinning GS that will be developed here, have synergy with other plant and animal genetic improvement programmes, and will support the discovery of genes governing complex traits of agronomic and biological significance.
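
To make the link between cycle length and rate of genetic gain concrete, the sketch below uses the standard breeder's equation, in which expected gain per year is selection intensity times selection accuracy times additive genetic standard deviation, divided by cycle length. The equation is textbook quantitative genetics rather than something stated in this summary, and every number in the example is an assumption chosen for illustration, not a project result.

```python
# Minimal sketch of the breeder's equation: gain per year = i * r * sigma_A / L.
# All parameter values below are assumed for illustration only.

def gain_per_year(selection_intensity, accuracy, additive_sd, cycle_years):
    """Expected response to selection per year."""
    return selection_intensity * accuracy * additive_sd / cycle_years


def break_even_accuracy(reference_accuracy, reference_years, new_years):
    """Accuracy a shorter cycle needs in order to match the reference gain per year."""
    return reference_accuracy * new_years / reference_years


# Hypothetical comparison: a 4-year phenotypic cycle with accuracy 0.8
# against a 2-year genomic cycle with accuracy 0.6.
phenotypic = gain_per_year(1.0, 0.8, 1.0, cycle_years=4)
genomic = gain_per_year(1.0, 0.6, 1.0, cycle_years=2)

print(f"relative gain per year (genomic / phenotypic): {genomic / phenotypic:.2f}")
print(f"break-even accuracy for the 2-year cycle: {break_even_accuracy(0.8, 4, 2):.2f}")
```

Under these assumed values the shorter cycle delivers more gain per year despite a lower accuracy, which is why halving the cycle can pay off even when genomic predictions are less accurate than progeny tests.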

Planned Impact

The main impact will be on plant breeders and geneticists, particularly those concerned with population based genetic improvement programmes. One of the outcomes of this work will be to deliver models and equations for predicting the breeding values of individual genotypes based on genome wide molecular markers. In the particular crop of perennial ryegrass, which is used as the model, selection on the basis of the combined genotype will deliver a time saving of two of the four years of the present phenotypically based breeding cycle. This research thus has great interest for breeding companies and research institutes engaged in genetic improvement of crops.
Increased speed of genetic improvement of forage grasses will also benefit livestock farmers, and have a beneficial impact on food security and consumers.

The project will have wider interests for the academic research community by providing a large number of validated SNPs for perennial ryegrass, but more generally will facilitate high throughput molecular marker assisted research to assess the diversity of grass populations and their usefulness for incorporating novel trait characteristics into breeding populations. It will pave the way for more genuinely genome-wide association mapping projects, which have the potential to elucidate individual genes underlying complex traits, such as water soluble carbohydrates and digestibility, to mention a few of high relevance for forage crops. More widely, genomic selection will have a major impact on our ability to unravel the genetic architecture of agronomically and biologically important complex traits, which is key to improving new traits in the longer term.

It will also benefit the research community in terms of contributing models and equations for genomic estimation of breeding value using computational biology and machine learning methods for genomic selection directly relevant for a breeding programme, and thus further demonstrating the beneficial impact of computational biology on genetic and genomics assisted plant research.

The partnership between IBERS and Germinal Holdings Ltd provides us with real opportunities to demonstrate scientific excellence with impact by allowing the potential of GS to be captured for the production of new varieties, and thus showcasing the application of GS in plant breeding.
 
Description The objective of this project is to apply genomic selection approaches to the existing perennial ryegrass breeding programme at Aberystwyth University. To do this we have developed a 3500 SNP marker chip for genotyping the breeding population. The marker data have been combined with phenotypic data available from the breeding programme, and various statistical and machine learning methods have been applied to the data in order to obtain genomically estimated breeding values (GEBV) of selection candidates. Selection of parental material for generation of new varieties has been made on the basis of GEBVs derived from this project, and the performance of synthetic populations will be compared to controls based on phenotypic selection and to a population based on random selection of parents. Preliminary results will be available in the summer of 2015. We have carried out extensive cross validation estimates based on different prediction models, and the best results have been achieved with ridge regression BLUP and Random Forest predictions.
Exploitation Route The prediction models we have developed in the project are already being used to make selection decisions for the generation of new varieties of perennial ryegrass. The targets are a combination of traits such as biomass yield, digestibility, persistency and water soluble carbohydrate content. The models are also being used in related projects to improve fatty acid composition and concentration in ryegrass germplasm.
The outcomes of the project have been taken forward by expanding the training population from around 250 in the original project to approximately 900 through other funding streams. This has led to a substantial rise in prediction accuracy for the key traits mentioned above, so that accuracy for digestibility and other quality traits is around 0.7-0.8, and accuracy for biomass yield in the second harvest year is >0.6. This represents a step change for IBERS grass breeders, and has significantly increased confidence in the methodology. They are now using the GEBVs to compose the best parent combinations for specific trait improvement before phenotypic data have been produced from the progeny testing.
Sectors Agriculture, Food and Drink, Education
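
The cross validation of prediction models mentioned above can be sketched as follows: ridge regression on marker dosages, with accuracy measured as the correlation between held-out predictions and observed phenotypes. This is a minimal standard-library illustration and not the project's code. The data are simulated, the penalty `lam` is fixed by hand (ridge regression BLUP software typically estimates it from variance components), and all function names are hypothetical.

```python
# Minimal sketch: k-fold cross-validated accuracy of ridge regression.
# Data are simulated; the penalty and all names are assumptions.
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Return (intercept, effects) minimising ||y - mu - X b||^2 + lam * ||b||^2."""
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]  # centred markers
    yc = [v - y_mean for v in y]
    xtx = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(Xc, yc)) for i in range(p)]
    b = solve(xtx, xty)
    mu = y_mean - sum(m * e for m, e in zip(col_mean, b))
    return mu, b


def predict(X, mu, b):
    return [mu + sum(x * e for x, e in zip(row, b)) for row in X]


def pearson(u, v):
    mu_u, mu_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu_u) * (c - mu_v) for a, c in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((c - mu_v) ** 2 for c in v)
    return cov / (var_u * var_v) ** 0.5


def cross_validated_accuracy(X, y, lam, k=5, seed=0):
    """Correlation between observed values and predictions made while held out."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    predicted = [0.0] * len(y)
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        mu, b = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        for i, p in zip(test, predict([X[i] for i in test], mu, b)):
            predicted[i] = p
    return pearson(predicted, y)


def simulate(n_plants=60, n_markers=20, seed=1):
    """Simulated dosages (0/1/2) and phenotypes = marker effects + noise."""
    rng = random.Random(seed)
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_markers)]
    X = [[rng.randint(0, 2) for _ in range(n_markers)] for _ in range(n_plants)]
    y = [sum(x * e for x, e in zip(row, effects)) + rng.gauss(0.0, 1.0) for row in X]
    return X, y


X, y = simulate()
accuracy = cross_validated_accuracy(X, y, lam=1.0, k=5)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Each plant is predicted only by a model that never saw its phenotype, so the correlation estimates how well the model would rank untested selection candidates. Note that accuracy here is measured against phenotypes, which is a common proxy when true breeding values are unknown.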

 
Description The prediction models developed here are now being used to make selection decisions in the current breeding programme of Aberystwyth University's Forage Plant Breeding team. A total of 5 synthetic populations were generated and underwent phase 1 trials together with synthetics selected on the basis of phenotypic data. None of the genomic populations were taken forward to national list trials, but since this initial attempt, prediction accuracies have increased significantly as the training population has grown. The new prediction model will be used to compose synthetic populations from the most recent generation of the breeding population.
First Year Of Impact 2015
Sector Agriculture, Food and Drink, Education
Impact Types Economic

 
Description Agricultural production
Amount £550,000 (GBP)
Funding ID 93314-562353 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 11/2017 
End 10/2020
 
Description BBSRC International Partnering Awards
Amount £30,000 (GBP)
Funding ID BB/L027011/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 05/2014 
End 11/2016
 
Description Crop and livestock disease challenge
Amount £235,000 (GBP)
Funding ID TS/M00886X/1 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 02/2015 
End 01/2020
 
Description Walsh Fellowship
Amount € 145,000 (EUR)
Organisation Irish Research Council 
Sector Public
Country Ireland
Start 05/2017 
End 04/2021
 
Title Prediction models for genomically estimated breeding values in perennial ryegrass 
Description Development of statistical models for predicting genomically estimated breeding values in the IBERS perennial ryegrass recurrent selection breeding programme. These are based on training populations with phenotypic and genotypic data.
Type Of Material Improvements to research infrastructure 
Year Produced 2014 
Provided To Others? Yes  
Impact The prediction models were used to select motherplants to be crossed for production of potential new varieties of perennial ryegrass. The progeny from the crosses is compared with those obtained from phenotypic selection. 
 
Title R scripts for manipulation of phenotypic and genotypic data and genomic prediction 
Description Collection of functions used for a research project on genomic selection of Lolium perenne (L.). Functions in this file cover several functionalities:
1. manipulation and transformation of genotypic matrices
2. manipulation and transformation of phenotypic data, including data with repeated observations
3. easing work with the rrBLUP package
4. writing FASTA files from a given file of contigs
5. functions for general statistical analysis and modeling
6. general purpose utility functions (string manipulations, etc.)
Type Of Material Technology assay or reagent 
Provided To Others? No  
Impact Manipulation and transformation of future raw phenotypic datasets from the Lolium perenne breeding programme for inclusion in further genomic prediction based breeding decisions is easier with these scripts. 
URL https://github.com/stas-g/J006955
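
The kinds of data manipulation listed for these scripts can be sketched as below. The actual scripts are written in R and their function names and behaviour are not reproduced here. This Python sketch only illustrates, under assumed input formats, how genotype calls might be recoded as allele dosages, how missing calls might be filled with marker means, and how contigs might be written in FASTA format. All names and example data are hypothetical.

```python
# Minimal sketch of genotype recoding, mean imputation and FASTA writing.
# Input formats, names and data are assumptions for illustration only.
import io


def dosage_matrix(calls, ref_alleles):
    """Recode two-letter genotype calls as counts of the non-reference allele.

    Calls that are not two letters, or that contain '-', are treated as missing (None).
    """
    matrix = []
    for plant_calls in calls:
        row = []
        for call, ref in zip(plant_calls, ref_alleles):
            if len(call) != 2 or "-" in call:
                row.append(None)
            else:
                row.append(sum(1 for allele in call if allele != ref))
        matrix.append(row)
    return matrix


def impute_with_mean(matrix):
    """Replace missing dosages with the mean of the observed dosages for that marker."""
    n_markers = len(matrix[0])
    means = []
    for j in range(n_markers):
        observed = [row[j] for row in matrix if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in matrix]


def write_fasta(contigs, handle, width=60):
    """Write {name: sequence} records to a text handle, wrapping sequence lines."""
    for name, seq in contigs.items():
        handle.write(f">{name}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")


# Hypothetical calls for two plants at three markers.
calls = [["AA", "AG", "--"],
         ["AG", "GG", "CC"]]
ref_alleles = ["A", "A", "C"]

dosages = dosage_matrix(calls, ref_alleles)
complete = impute_with_mean(dosages)

buffer = io.StringIO()  # stands in for an open output file
write_fasta({"contig_1": "ACGTACGTAC"}, buffer, width=4)
print(complete)
print(buffer.getvalue())
```

Mean imputation is only one simple option for missing marker data. It is used here because it keeps the example short, not because the source states which method was applied.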
 
Description Colombia and Kenya Partnering Award: Skills sharing for genomic approaches to forage improvement 
Organisation CGIAR
Department International Center for Tropical Agriculture
Country Colombia 
Sector Charity/Non Profit 
PI Contribution Participation in workshop on genomic selection assisted breeding of tropical forage crops
Collaborator Contribution Breeder participation and visit for training purposes
Impact Visit of plant breeder from CIAT to TGAC in the summer of 2016. Forages for Africa programme also initiated with participation from IBERS.
Start Year 2014
 
Description Colombia and Kenya Partnering Award: Skills sharing for genomic approaches to forage improvement 
Organisation Earlham Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution Participation in workshop on genomic selection assisted breeding of tropical forage crops
Collaborator Contribution Breeder participation and visit for training purposes
Impact Visit of plant breeder from CIAT to TGAC in the summer of 2016. Forages for Africa programme also initiated with participation from IBERS.
Start Year 2014
 
Description Colombia and Kenya Partnering Award: Skills sharing for genomic approaches to forage improvement 
Organisation International Livestock Research Institute (ILRI)
Country Kenya 
Sector Charity/Non Profit 
PI Contribution Participation in workshop on genomic selection assisted breeding of tropical forage crops
Collaborator Contribution Breeder participation and visit for training purposes
Impact Visit of plant breeder from CIAT to TGAC in the summer of 2016. Forages for Africa programme also initiated with participation from IBERS.
Start Year 2014
 
Description Article in British Dairying about the Genomic selection programme in perennial ryegrass 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact An article in British Dairying, March 2015, Vol 21, no 5, pp 54-55, which describes the progress and the promise of genomic selection in IBERS forage crop breeding programmes. The article was aimed at dairy farmers and related professions.
Year(s) Of Engagement Activity 2015
 
Description Bioinformatics for Breeding 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Participated as a teacher on the Bioinformatics for Breeding course held at the Earlham Institute, Norwich over three days in February 2017. About 30 researchers attended, all with an active interest in how to use bioinformatics in breeding-related research, such as next generation sequence data for genetic mapping, GWAS and genomic selection.
Year(s) Of Engagement Activity 2017
 
Description CGIAR Workshop on Implementing Genomic Selection in CGIAR Breeding Programs, Montpellier, France, 2015 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop to discuss and outline ways forward for implementing genomic selection in CGIAR's diverse range of breeding programs.
Year(s) Of Engagement Activity 2015
 
Description Lecture to 3rd year undergraduate students 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact Lecture to undergraduate students taking BR31420, Bioinformatics and functional genomics
Year(s) Of Engagement Activity 2015
 
Description National Institutes of Bioscience 2013 conference, Roslin 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited as session speaker at the National Institutes of Bioscience 2013 conference in Roslin. The title of the talk was: Marker assisted plant breeding on a genome-wide scale. The audience was colleagues from other BBSRC funded institutes and universities.
Year(s) Of Engagement Activity 2013
 
Description Participation in International Workshop in Abuja, Nigeria, 2015. The workshop was entitled: Future Proofing Agriculture Production Against Environmental Change 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Participation in workshop organised by the British Council on Future Proofing Agriculture Production Against Environmental Change. The title of my talk was: Tapping the Potentials of Germplasm Variation Using Genomic and Post-Genomic Approaches.
Year(s) Of Engagement Activity 2015
 
Description Plant and Animal Genome Conference XIX, San Diego, California, 2013 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Oral presentation of our work on genomic selection in the ryegrass breeding programme at the International Lolium Genome Initiative. Title of talk: Genomic selection in ryegrass
Year(s) Of Engagement Activity 2013
 
Description Plant and Animal Genome Conference XXIII, San Diego, California 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Oral presentation of results of our work on genomic selection in perennial ryegrass at the International Lolium Genome Initiative Workshop. Title of talk was: Machine Learning for Genomic Prediction in Lolium perenne.
Year(s) Of Engagement Activity 2015
 
Description Training course at IBERS for 50 advisors 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Training for advisors relating to forage crop breeding and grassland management
Year(s) Of Engagement Activity 2015,2016
 
Description Visit by Michael Norriss of PGG Wrightson Seeds 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Michael Norriss from PGG Wrightson Seeds visited IBERS to hear about forage breeding activities at IBERS and discuss possibilities of collaboration
Year(s) Of Engagement Activity 2018
 
Description Workshop on Genomic Selection in Animals and Plants. Norwegian University of Life Sciences, Oslo
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation of plenary lecture entitled: Potential use of GS and association mapping in plant breeding. This was organised by the Life Sciences University in Oslo. The primary audience was colleagues and postgraduate students from the university.
Year(s) Of Engagement Activity 2012