Comparative population genomics of red clover domestication and improvement

Lead Research Organisation: Earlham Institute
Department Name: Research Faculty

Technical Summary

The aim of this project is to characterise the genetic diversity in natural and breeding populations of red clover in order to identify genome-wide changes that occurred during breeding. We wish to investigate to what extent artificial selection has affected the genome, including genic and non-genic regions, and whether this has resulted in a reduction in diversity and an increase in rare alleles. We propose the following programme to test this. Firstly, we will use RAD marker polymorphisms from an existing mapping family to improve the alignment of the red clover genome assembly to a genetic map. Secondly, we will use a panel of 600 genotypes, comprising ecotypes with mostly prostrate growth habit together with erect elite breeding material, to analyse genetic diversity by RAD marker sequencing. Phenotypic data, chemical analyses and ecogeographical information for the diversity panel will allow us to determine how haplotype structure correlates with phenotype and with environmental gradients likely to affect environmental adaptation. Nucleotide diversity and locus-by-locus genetic differentiation will reveal genomic regions under selection. We will also generate mapping families segregating for growth habit, enabling us to associate and map more accurately the target traits relating to the prostrate and erect phenotypes. The goal is to identify the genes responsible for most of the genetic variation in this key domestication trait. This project will generate information on the genetic basis of a fundamental trait, provide insight into selection during recent domestication and inform the forage legume breeding programme. We will also re-sequence the five pollen donors in the new mapping family. Together with the RAD marker data and the reference assembly, this will give us valuable information about genome-wide linkage disequilibrium, levels of heterozygosity, SNP density and patterns of polymorphism in coding and non-coding sequence.
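
As a rough illustration of the two summary statistics named in this summary, the sketch below computes per-site nucleotide diversity and a per-locus differentiation estimate from allele counts in two groups (for example, ecotypes versus elite material). It is not the project's pipeline: the function names, the choice of Hudson's FST estimator and the toy allele counts are all assumptions made for illustration.

```python
import math


def site_pi(alt_count, n_chrom):
    """Unbiased per-site nucleotide diversity for one biallelic SNP.

    alt_count: copies of the alternate allele observed
    n_chrom:   number of sampled chromosomes (2 x individuals for a diploid)
    """
    if n_chrom < 2:
        raise ValueError("need at least two sampled chromosomes")
    p = alt_count / n_chrom
    return 2.0 * p * (1.0 - p) * n_chrom / (n_chrom - 1)


def hudson_fst(alt1, n1, alt2, n2):
    """Hudson's FST estimator for one biallelic SNP in two populations.

    Returns NaN when the locus is fixed for the same allele in both groups.
    """
    p1, p2 = alt1 / n1, alt2 / n2
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else math.nan


# Hypothetical counts per SNP (not project data):
# (alt copies in ecotypes, chromosomes in ecotypes,
#  alt copies in elite,    chromosomes in elite)
toy_loci = {
    "snp_a": (10, 20, 11, 20),
    "snp_b": (2, 20, 18, 20),
}

fst_by_locus = {name: hudson_fst(*c) for name, c in toy_loci.items()}
most_differentiated = max(fst_by_locus, key=fst_by_locus.get)
```

In a real scan, values like these would typically be averaged in windows along the genome, and regions with low diversity in the elite group combined with high differentiation would be treated as candidates for selection rather than as proof of it.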

Planned Impact

The delivery of the proposed objectives is focused on current needs for which suitable tools and skilled researchers are lacking or not fully developed. The objectives will also directly contribute solutions in areas of research relevant to the BBSRC's scientific priorities in food security and integrate established methods in genetics with emerging multidisciplinary research areas such as genomics and bioinformatics.
This programme will generate new opportunities for collaborative work between TGAC and IBERS and will also extend to other R&D groups in industry and academic institutions. We will communicate our results on fundamental aspects of domestication traits in forage crops through peer-reviewed publications and at national and international conferences. This work will significantly improve the draft red clover genome, which will be one of the first for a temperate forage crop. It will thus be of general interest beyond the immediate circle of colleagues interested in red clover. We will disseminate the data using TGAC computing resources and by depositing the raw sequences and assemblies in long-term repositories established at the EMBL-EBI. One of the main stakeholders is Germinal Holdings Ltd, which funds a significant part of the IBERS breeding programme. This work is aligned closely with the initial stages of the new red clover breeding programme. The generation of large numbers of SNP marker polymorphisms will bring with it the potential to use genomics-based prediction of selection candidates. Accelerated breeding cycles lead to a faster route from breeding material to new varieties. The current programme of genetic improvement in red clover has the potential to lead to significant increases in seed sales, of which Germinal Holdings currently has 18% in the UK. New varieties with better persistence and disease resistance also open up significant opportunities for export to the European market, as well as to North America and New Zealand.

The proposed work will directly impact the local community through the generation of new jobs and potential funding opportunities. The collaboration between IBERS and TGAC has strengthened the position of these Institutes in hosting specific expertise in advanced breeding for forage legumes. This will create new opportunities around the development of applications in genomics and bioinformatics, which will translate into job opportunities. It is widely recognised that the shortage of expertise and skills in biomathematics and informatics in the UK and across the world is a major risk for future development in the life sciences. In this context, this proposal will attract talented staff to work with IBERS and TGAC. Finally, the ecotype collection we use here consists partly of natural populations that were collected from locations in Eastern Europe between the early 1990s and the early 2000s. Many of these habitats are now under severe threat, which means that some of the accessions we work with may no longer exist naturally. Our work thus represents the best way of utilising this unique resource and preserving the diversity.

This collaboration will help to reinforce the UK's leadership in translating the development of genetic and genomic resources from fundamental science to applications with a potential impact on the local and national economy. The development of a more sustainable agriculture is a key aspect of the UK strategy, and this is aligned with the objectives in the recently published Agri-tech Strategy. A consequence of the implementation of this proposal is to position IBERS and TGAC as international leaders in biotechnology, specifically in the area of forage legumes. This will deliver impact to a broad range of stakeholders, emphasising the key role that the Institutes will play in enabling researchers to develop cutting-edge science in the coming years, and will ensure that the genomics resources are translated to research and breeding programmes.

Description Red clover is one of the essential forage legume crops in temperate agriculture and a key component of sustainable intensification of livestock farming systems. Enhancing its role in sustainable agriculture requires genetic improvement of persistency, disease resistance, and tolerance to grazing. We have assembled a chromosome-scale reference genome for red clover to help address these challenges and made it available to researchers and breeders through Ensembl, Phytozome and LegumeBase. Our analysis showed that red clover diverged relatively recently from the model legume Medicago truncatula, so most findings in M. truncatula could be applied efficiently to red clover.
We have used this reference to analyse three red clover populations with the ultimate goal of bringing novel donor accessions to the breeding programme. We sequenced five accessions from the centre of origin of the species. We also integrated the genome assembly with a genetic map from a biparental population segregating for growth habit (erect or prostrate), a predictive trait for persistency in several temperate forages.
Exploitation Route Our reference is publicly available in three genome databases (Ensembl, Phytozome and LegumeBase) and has been cited in 82 studies to date; https://scholar.google.co.uk/scholar?oi=bibs&hl=en&cites=343892818495830346
Our collaborators at IBERS lead the temperate forages improvement programme and have used this reference to analyse a collection of accessions from the genebank and incorporate new material in the breeding programme. We have identified markers, and subsequently donor accessions, for prostrate growth; the breeders at IBERS have included these accessions in new crosses, aiming for improvements in persistency.
We have used genotyping by sequencing (GBS) to determine the genetic variation and population structure in red clover natural populations from Europe and Asia and in varieties or synthetic populations. Linkage disequilibrium decayed rapidly, and we found no significant evidence of bottlenecks.
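
To make the linkage disequilibrium statement concrete, the sketch below shows one common way to summarise LD decay from unphased GBS-style genotypes: the squared correlation of genotype dosages (0, 1 or 2 alternate alleles) for each SNP pair, averaged in bins of physical distance. This is an illustrative approximation, not the analysis used in the project; the function names, the bin size and the toy genotypes are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def genotype_r2(g1, g2):
    """Squared Pearson correlation between two dosage vectors.

    g1, g2: per-individual dosages (0, 1, 2), with None for missing calls.
    Returns NaN if either SNP is monomorphic among the shared calls.
    """
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return math.nan
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return math.nan
    return cov * cov / (var_a * var_b)


def ld_decay(positions, genotypes, bin_size):
    """Mean r2 per distance bin for all SNP pairs on one chromosome.

    Returns {bin start in bp: mean r2}, ordered by distance.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for i, j in combinations(range(len(positions)), 2):
        r2 = genotype_r2(genotypes[i], genotypes[j])
        if math.isnan(r2):
            continue
        b = abs(positions[j] - positions[i]) // bin_size
        sums[b][0] += r2
        sums[b][1] += 1
    return {b * bin_size: total / count
            for b, (total, count) in sorted(sums.items())}


# Toy data (not project data): three SNPs typed in four individuals.
toy_positions = [100, 200, 5000]
toy_genotypes = [[0, 1, 2, 1], [0, 1, 2, 1], [0, 2, 0, 2]]
decay = ld_decay(toy_positions, toy_genotypes, bin_size=1000)
```

A fast decay shows up as mean r2 dropping towards background levels within short distance bins, which in practice implies that dense markers are needed for association mapping.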

A genome-wide association study identified a single nucleotide polymorphism (SNP) located in a homologue of the VEG2 gene from pea, associated with flowering time. Identifying genetic variation within the natural populations is likely to be useful for enhancing the breeding of red clover in the future.
Sectors Agriculture, Food and Drink

 
Description The sequencing data generated by this project have been used by IBERS' breeding programme to enrich the gene pool and by the "Aberystwyth University Seed Biobank" to classify the material in their seed banks. Growth habit (prostrate or erect) is a proxy trait for persistency, a prominent target trait in IBERS' red clover breeding. Breeders have also used this material to improve their genomic selection pipelines. We will continue to work with our collaborators at IBERS to realise the value of the data generated by this project.
First Year Of Impact 2018
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic, Policy & public services

 
Description Darwin Tree of Life
Amount £9,360,421 (GBP)
Funding ID 218328/Z/19/Z 
Organisation Wellcome Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 11/2019 
End 05/2022
 
Description John Innes Foundation internship in data-driven plant bioinformatics
Amount £17,500 (GBP)
Organisation John Innes Foundation 
Sector Charity/Non Profit
Country United Kingdom
Start 05/2021 
End 05/2022
 
Title PRJEB34003 - Red clover Trifolium pratense high-quality genome 
Description Red clover Trifolium pratense high-quality genome assembly using our pipeline developed in WP1 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact Used as the reference for a new dataset for a GCRF funded project. 
 
Title Trifolium pratense reference 
Description We assembled 309 Mb of the red clover genome in 39,904 scaffolds. Half of the assembly was contained in 353 scaffolds (N50 = 223 kb), while 1,054 scaffolds longer than 50 kb contained another 25%. We annotated 40,868 genes and 42,223 transcripts. Of those, 22,042 genes were anchored onto the seven chromosomes. The reference is available through Ensembl (https://plants.ensembl.org/Trifolium_pratense/Info/Index), Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Tpratense) and LegumeBase (https://legumeinfo.org/genomes/gbrowse/Tp2.0) 
Type Of Material Database/Collection of data 
Year Produced 2015 
Provided To Others? Yes  
Impact Enhancing red clover's role in sustainable agriculture requires genetic improvement of persistency, disease resistance, and tolerance to grazing. To help address these challenges, we assembled a chromosome-scale reference genome for red clover. We observed large blocks of conserved synteny with the model legume Medicago truncatula and estimated that the two species diverged ~23 million years ago. Among the 40,868 annotated genes in red clover, we identified gene clusters involved in biochemical pathways of importance for forage quality and livestock nutrition. 
URL https://plants.ensembl.org/Trifolium_pratense/Info/Index
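
For readers unfamiliar with the contiguity figures quoted in this entry, the sketch below shows how N50 (the scaffold length at which the cumulative sorted lengths reach half the assembly) and the matching scaffold count, often called L50, are conventionally derived from a list of scaffold lengths. In those terms the entry reports an N50 of 223 kb reached with 353 scaffolds. The function name and the toy lengths are hypothetical and are not taken from the red clover assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths.

    N50: length of the scaffold at which the running total of lengths,
         sorted longest first, reaches at least half the assembly size.
    L50: number of scaffolds needed to reach that point.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise AssertionError("unreachable for non-empty input")


# Toy scaffold lengths in bp (hypothetical).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
```

With the toy lengths the total is 300, so the halfway point of 150 is reached after the two longest scaffolds, giving an N50 of 70 and an L50 of 2.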
 
Description Aberystwyth University - IBERS, in temperate and tropical forages 
Organisation University of Wales
Department Institute of Biological Environmental and Rural Sciences (IBERS)
Country United Kingdom 
Sector Academic/University 
PI Contribution Genetic map integration with genome assemblies. SNP and diversity calling in natural and induced populations of forages.
Collaborator Contribution Genetic map assembly and validation. Genotyping-by-sequencing (GBS) library preparation and sequencing for diversity analysis. Gene and transcriptomic analysis.
Impact Chromosome-scale genome references for Trifolium pratense, Lolium perenne and Brachiaria ruziziensis.
Start Year 2014
 
Description Agrosavia-Earlham Institute MoU 
Organisation Colombian Agricultural Research Corporation
Country Colombia 
Sector Charity/Non Profit 
PI Contribution Genomic approaches to access the crop diversity at the Colombian National Germplasm collection hosted by Agrosavia/Corpoica
Collaborator Contribution Making available genetic resources and evaluation trials
Impact Two pilot projects on Musa accessions and legume forages to identify genome-wide trait-SNP associations for molecular breeding.
Start Year 2019
 
Description Bioinformatics for Breeding 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The practical course featured a collection of methods and bioinformatics tools fundamental for modern breeding, especially for crops. Next generation sequencing (NGS) has made possible large open collections of genomic diversity data, such as SNPs, that can be used as molecular markers for breeding. Combined with phenotypes, genome-wide association studies provide breeders with an understanding of the molecular basis of complex traits. Content: SNP calling/discovery, SNP effects and context; NGS techniques for genotyping; Genetic markers, linkage analysis, and genetic maps; High-throughput phenotyping and image analysis; Association mapping (GWAS); Genome-wide predictions, modelling and simulations; Genomic selection.
Year(s) Of Engagement Activity 2017
URL http://www.earlham.ac.uk/bioinformatics-breeding