GENADAPT - Genotypic and Environmental Adaptation through Data Driven Prediction Techniques

Lead Research Organisation: University of Leeds
Department Name: Sch of Biology

Abstract

The sustainable intensification of UK agriculture is a major challenge that will require many innovations to address. One key aspect centres on the need to improve our cereal crops to enable higher yields from the same land area, under variable and unpredictable climates. The primary cereal crop in the UK is wheat (Triticum aestivum), the yield of which is highly variable depending on environmental conditions. However, the development of improved and regionally adapted wheat is a slow process which remains reliant on trial and error across multiple crosses and on in-field phenotypic selection. A further limitation is that the current approach produces a new cultivar which is improved only for the region in which it was selected. The process therefore has to be repeated in multiple regions, making it extremely labour-intensive.

Ideally, we need a method which will accelerate the adaptation process so that fewer crosses are required and therefore less time is wasted growing and phenotyping plants. As part of this, the use of mathematical models has been important and has resulted in accelerated methods for future yield predictions. However, mathematical modelling has so far not fully utilised the new wave of data developed from genomic selection and large scale field-based phenotyping.

In this project we will combine genetic, environmental and field phenotyping data to enable genetic-based predictions for target environments. We will achieve this by combining our understanding of the genes involved in flowering time adaptation in bread wheat with machine learning models that can test genetic hypotheses. By controlling the genetic combinations used by the machine learning models, we will be able to derive new understanding of novel genetic combinations and of how the defined genetic combinations perform under specified environmental conditions. We will then challenge this new understanding by measuring flowering time responses under controlled cabinet conditions which mimic the environmental conditions used in the model.

The outcomes of this project will be the development of genetically driven machine learning models which can make precise predictions regarding flowering time of our primary arable crop, wheat. These predictions will be experimentally tested under realistic conditions in controlled growth cabinets. The project will also provide a practical framework which can be applied to new environmental conditions and therefore to different target countries.

Technical Summary

Within the UK we urgently need to develop methods which will enable the rapid sustainable intensification of our arable agriculture. Arable crop production is a major industry which underpins our food security. However, we have not invested in methods which enable advances in crop breeding. Most notably, wheat breeding remains largely reliant on large numbers of plant crosses and manual selection. This is time-consuming and labour-intensive.

We are proposing a method which can accelerate this process for our primary arable crop, bread wheat (Triticum aestivum). The method, which utilises the explosion of data in genotyping, field phenotyping and environmental recording, will use machine learning to identify how new genetic combinations would be adapted to multiple environmental conditions. To achieve this we will identify genetic clusters which will be used as an input to machine learning models. The models will also use field phenotyping and corresponding environmental data. Machine learning models have the capacity to identify networks which are not constrained by arbitrarily generated parameters. Additionally, they can utilise very large input datasets, enabling multiple important components of plant phenotype to be included when considering adaptation. Specific predictions made by the model will be tested experimentally. This will be achieved through the selection of specific germplasm with the required genetic clusters, grown under precisely replicated environmental conditions in controlled environment chambers.
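
As a rough illustration only, the idea of feeding a genetic cluster label together with environmental measurements into a predictive model might be sketched as below. This is not the project's model: the cluster names, the two environmental variables (mean temperature and day length), the scaling constants, the choice of a k-nearest-neighbour regressor and every data value are assumptions made up for the example.

```python
from math import sqrt

# Hypothetical genetic cluster labels; the project does not name its clusters.
CLUSTERS = ["cluster_a", "cluster_b", "cluster_c"]


def encode(cluster, mean_temp_c, day_length_h):
    """Build one feature vector: one-hot cluster label plus scaled environment values."""
    one_hot = [1.0 if cluster == c else 0.0 for c in CLUSTERS]
    # Crude fixed scaling (assumed) so no single feature dominates the distance.
    return one_hot + [mean_temp_c / 30.0, day_length_h / 24.0]


def predict_days_to_flowering(train, query, k=3):
    """k-nearest-neighbour regression: average the target of the k closest rows."""
    def distance(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = sorted(train, key=lambda row: distance(row[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Made-up illustrative records, not real measurements:
# (cluster, mean temperature in C, day length in hours, days to flowering).
records = [
    ("cluster_a", 12.0, 14.0, 150.0),
    ("cluster_a", 16.0, 16.0, 138.0),
    ("cluster_b", 12.0, 14.0, 162.0),
    ("cluster_b", 16.0, 16.0, 149.0),
    ("cluster_c", 12.0, 14.0, 171.0),
    ("cluster_c", 16.0, 16.0, 160.0),
]
train = [(encode(c, t, d), y) for c, t, d, y in records]

# Predict for a cluster in an environment that is absent from the training rows.
query = encode("cluster_b", 14.0, 15.0)
prediction = predict_days_to_flowering(train, query, k=2)
print(round(prediction, 1))
```

Because the one-hot cluster encoding separates genotypes strongly, the two nearest neighbours here are the two cluster_b records, so the prediction is simply an interpolation between environments for that cluster. A real model would need far more data, proper feature scaling and validation against held-out field trials.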

The outcomes of this project will be the development of genetically driven machine learning models which can make precise predictions regarding flowering time of our primary arable crop, wheat. The project will also provide a practical framework which can be applied to new environmental conditions and therefore to different target countries.
