Developing next generation genetic improvement tools from next generation sequencing

Lead Research Organisation: SRUC
Department Name: Research

Abstract

Phenotype- and pedigree-informed genetic improvement in livestock, using techniques such as Best Linear Unbiased Prediction (BLUP), has delivered rates of improvement of between 1% and 3% per annum in many livestock populations. Over the past 50 or so years, genetics research, including molecular and statistical developments, has been applied by many operational plant and livestock breeding programmes. The availability of reference sequences for many species has resulted in the discovery of many thousands (and more) of single-nucleotide polymorphisms (SNPs), leading to the ongoing development of low-cost SNP arrays and to their use around the globe for genomic selection in many livestock species. Through research and industry collaboration (nationally and internationally), the UK dairy industry implemented genomic selection for industry traits (milk production, fertility, longevity) in April 2012 using a pooled collaborative SNP genotype file (predominantly bulls, now over 100,000 individuals). This is expected to increase the annual rate of genetic improvement by approximately 30-50%.
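As a rough illustration of what these figures imply, the sketch below applies the quoted 30-50% relative uplift to the quoted 1-3% annual rate of improvement. Combining the two ranges in this way is an assumption made only for illustration: the 1-3% figure is given for livestock populations in general rather than for UK dairy specifically, and the function name and parameters are hypothetical.

```python
def accelerated_gain_range(base_low, base_high, uplift_low, uplift_high):
    """Return the (low, high) annual rate of improvement, in percent,
    after applying a relative uplift to a baseline range.

    Assumption: the uplift multiplies the baseline rate, i.e. a 30%
    uplift turns a 1% annual rate into 1.3%.
    """
    low = base_low * (1 + uplift_low)
    high = base_high * (1 + uplift_high)
    return low, high


# Baseline of 1% to 3% per annum, uplift of 30% to 50% (figures from the text).
low, high = accelerated_gain_range(1.0, 3.0, 0.30, 0.50)
print(f"{low:.2f}% to {high:.2f}% per annum")
```

Under these assumptions the annual rate would lie somewhere between the lowest baseline with the smallest uplift and the highest baseline with the largest uplift; the real figure for any one population depends on its own baseline.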

The next horizon for research and its translation into genetic improvement tools is the inclusion of sequence data alongside the tools that the industry has already invested in. These developments provide exciting opportunities for the research community to explore the more readily available and vast amounts of genomic data to create new knowledge and drive innovation in the field, as we have seen historically in plant and livestock genetic research. Because the UK dairy industry and research sectors are collectively likely to invest heavily in sequence information in the coming years, a collaborative strategy to generate, store and process sequence information efficiently is needed to enable its effective use in animal breeding research and in next generation genetic improvement tools. The rate of change we are now experiencing in the ready availability of genomic and sequence information means there is a real need to take a community-based approach to utilising these data, including the involvement of the end user as well as basic research. In the case of this proposal, the end-user focus is the animal breeding industry as well as the biosciences research community.

A DNA sequence captures the complete genome of an individual. If available for sufficient individuals, it will provide a range of benefits: 1) greater livestock improvement through more accurate and more persistent genomic selection, 2) the identification of targets for genome editing, 3) detection and breeding management of rare variants, including recent mutations, and 4) greater biological knowledge. More powerful biological discovery will be enabled because the causal nucleotides are contained within the sequence, unlike the case of markers such as single-nucleotide polymorphisms (SNPs).

The exploitation of sequence data in animal breeding programmes is expected to create a paradigm shift that will greatly enhance the production of food from farmed livestock through both increased output and reduced wastage. However, it is expensive to collect sequence data at the high read rates (essentially accuracy) generally needed for research, and this has led to small islands of sequence data at research institutes. Commercial breeding companies are beginning to assimilate some sequence data, but this is usually IP protected. Some have large genotype datasets that can be imputed to full sequence level. The aim of this project is to develop the methodologies to optimise the distribution of sequencing effort across, and in key populations within, the UK dairy cattle population. The hypothesis that will be studied is that the inclusion of optimal sequence data will improve the results of genome-wide association studies for novel traits and of genomic selection in the wider UK dairy cattle population, compared with the use of widespread SNP chips alone.

Technical Summary

The UK dairy research and industrial sectors, as well as RCUK, will invest heavily in sequence data in the coming years. This proposal will ensure that the infrastructure and strategy underpinning this investment are optimal. The platform generated will allow for the combination and analysis of both genotype and sequence information on the UK (dairy) cattle population, and will serve as a technology example for other livestock species. There are three components to this proposal. The first will develop a strategy for collecting sequence information across a whole population. The second will put in place the infrastructure required to handle, share, analyse, and utilise (inter)nationally shared sequence data within the UK dairy breeding and research communities. The third will evaluate the ability of sequence data to increase the accuracy of genomic selection and the power of genome-wide association studies in national dairy cattle populations. Through these activities this project will develop the strategy, infrastructure, and expertise for sequence data required by the UK breeding sector, and through this will help maintain world-class standards in both the science and application of dairy production in the UK.

Planned Impact

The future sustainability of the UK dairy industry relies on farmers being able to respond to key market signals, and future developments in genetic improvement tools are likely to be key. The partners have successfully worked together to deliver R&D that drives the ongoing changes in the genetic and genomic improvement tools used by farmers. For example, SRUC has delivered information on, and produced, practical dairy selection tools, particularly the inclusion of fertility, health, welfare and survival traits. Adoption of new indexes has improved animal health and welfare and economic performance compared to continued use of previous selection practices, and has cumulatively reduced greenhouse gas emissions per breeding animal by 1.4%/yr. The overall annualised economic benefit of the genetic improvement that took place in the years 1980-2009 is worth £127 million/year to the UK dairy industry.
Globally and within the UK, dairy production is important both commercially and for research. Dairy genetics is particularly important for research because, due to their high commercial value, dairy cattle serve as a model species for all other livestock. For example, dairy cattle breeding was the first to widely adopt genomic selection and was the sector that undertook many of the innovations in genetic and genomic evaluations.
Sequence data is expensive and therefore requires a unified approach involving all stakeholders. In dairy this includes research organisations, farmers, private companies, levy boards, and research funding agencies. In this proposal we will bring these groups together by developing a unified strategy for the collection, handling, and utilisation of sequence information within the UK dairy industry.
(i) The academic community. Sequence data is of great value for the prediction and understanding of quantitative traits. However, small islands of sequence data are of limited value on their own. To benefit from sequence, huge quantities are needed, quantities that are far beyond the scope of a single researcher or a single research project. This proposal will put in place the strategy and infrastructure to generate the quantities of sequence data that are required by current and future researchers.
(ii) Commercial sequence and genotype providers. Companies providing SNP or sequence data will be able to add value to the data that they generate.
(iii) Society. All members of society who work to improve or depend upon the competitiveness and sustainability of agriculture will benefit from the downstream practical applications. The application of the research by breeding organisations will lead to faster and more sustainable genetic progress, leading to healthier food, and food production that is more resource efficient and affordable. Increased efficiencies in agriculture have direct societal benefits in greater food security with less environmental impact.
(iv) UK science base. The proposed research will provide a platform for increased R&D capabilities in the UK, maintaining its scientific reputation and associated institutions, with increased capability for sustainable agricultural production.
(v) Training. The proposed research will be embedded within training courses that the investigators are regularly invited to give, and the post-docs and programmer working on the project will have the opportunity to be trained within a world-leading livestock genetics community in a cutting-edge area of research. Further, this project is aligned to the delivery of already awarded projects (involving postdocs and PhD students). This cohort of early career researchers will greatly benefit from the activity of the community.
(vi) Funding agencies. Funding agencies will be required to fund sequence data in the coming years. This research will put in place the tools and ideas required to optimise this investment and to maximise its complementarity to investments from other organisations (e.g. breeding companies).
 
Description This project developed new algorithms, informatics pipelines and data processing that allow us to integrate genetic, genomic and phenotypic information to create enhanced genetic evaluations for a range of species and scenarios.
These include (i) ways of rebuilding and filling in gaps in the pedigree of animals from a combination of data sources, (ii) systems that integrate data and advise on target animals for specialised genotyping and phenotyping, (iii) mating advice, integrated with other research outputs, for the management of genetic improvement in a chosen population, balancing genetic gain with genetic diversity, and (iv) enhanced genomic predictions (in accuracy and range of traits available) from integrated data systems.
Exploitation Route The results can be used by industry stakeholders that have input into breeding programme design and delivery.
Sectors Agriculture, Food and Drink

 
Description We have created a bioinformatic pipeline that integrates sequence, genotype and pedigree data on animals at different levels of detail (e.g., sparse vs full pedigree, few vs many SNP genotypes, partial vs whole genome sequence). The purpose of integrating these data is to impute missing genome information. These systems have been used to undertake work on this and related projects on dairy and beef cattle from the UK and Africa. The pipeline has been tested on population-level data covering over 250,000 UK dairy industry animals with a range of pedigree, genotype and whole genome sequence data. The systems have been further extended to beef cattle, sheep, goat and pig populations and used both in research settings and in the delivery of genomic-based breeding solutions for these populations.
First Year Of Impact 2019
Sector Agriculture, Food and Drink
Impact Types Societal,Economic

 
Description Allocation of Research Excellence Grant
Amount £46,198 (GBP)
Organisation Government of Scotland 
Department Scottish Funding Council
Sector Public
Country United Kingdom
Start 10/2017 
End 03/2018
 
Description Allocation of Research Excellence Grant
Amount £62,000 (GBP)
Organisation Government of Scotland 
Department Scottish Funding Council
Sector Public
Country United Kingdom
Start 06/2015 
End 05/2016
 
Description BBSRC Impact Accelerator
Amount £150,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 08/2017 
End 03/2018
 
Description CONCEPTION TO CONSUMPTION: aligning farmers to consumers using modern data, decision support and precision agriculture techniques.
Amount £2,395,518 (GBP)
Funding ID 105156 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 05/2019 
End 03/2022
 
Description Grant Award
Amount £10,000,000 (GBP)
Organisation Bill and Melinda Gates Foundation 
Sector Charity/Non Profit
Country United States
Start 09/2015 
End 09/2020
 
Description Horizon 2020
Amount € 7,000,000 (EUR)
Funding ID 727213-2 
Organisation European Union 
Sector Public
Country European Union (EU)
Start 05/2017 
End 04/2022
 
Description Horizon 2020
Amount € 7,000,000 (EUR)
Organisation European Union 
Sector Public
Country European Union (EU)
Start 02/2016 
End 01/2020
 
Description RESAS Strategic Research Portfolio
Amount £25,000,000 (GBP)
Funding ID Work package 2.3 Agricultural Systems 
Organisation Government of Scotland 
Department Scottish Government Rural and Environment Science and Analytical Services Division (RESAS)
Sector Public
Country United Kingdom
Start 04/2016 
End 04/2021
 
Description RESAS funded studentship
Amount £80,000 (GBP)
Organisation Government of Scotland 
Sector Public
Country United Kingdom
Start 10/2014 
End 03/2018
 
Title Imputation from low density genotypes to whole genome sequence 
Description As part of this project, a database system/pipeline was developed to help impute genotypes on commercial cattle to full whole genome sequence. Because this aims to work with potentially hundreds of thousands of animals, a compute-efficient pipeline needed to be developed that integrates with industry genotypes, which are potentially commercially sensitive and therefore protected. Part of this step involved integrating national genotype databases with a DRAGEN server, which is used to align sequence data. The system aligns and imputes to whole genome sequence approximately 50% faster than previous methods. 
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? Yes  
Impact The alignment of cattle sequence data using DRAGEN, including imputation, was used by a research group in Roslin studying the genomics of TB resistance. 
 
Title Cattle genotypes and sequences database 
Description A database system was developed to manage genotype and sequence data for UK cattle with multiple contributors (commercial and research). 
Type Of Material Data handling & control 
Year Produced 2016 
Provided To Others? Yes  
Impact The system for managing national genotype (and sequence) data has been used by a number of other research groups that have permission to access those data. Further, where "public" data are pooled, it is possible to query the data and request access. 
 
Description Dairy Australia MoU 
Organisation Dairy Australia
Country Australia 
Sector Public 
PI Contribution Under this Memorandum of Understanding the partners agree to work jointly on the development of milk mid-infrared prediction tools to help dairy farmers manage and select their cows, the combining of genomic information, and the integration of genomic and milk mid-infrared data.
Collaborator Contribution The partners, by sharing data, will help to further improve the impact of the original BBSRC project(s) after the projects have ended.
Impact No outputs as yet. Not multidisciplinary
Start Year 2017
 
Description Genomes Canada - Efficient Dairy Genome Project 
Organisation University of Alberta
Country Canada 
Sector Academic/University 
PI Contribution SRUC has made an in-kind contribution by sharing data on a) daily feed intake and monthly weights for four years in the UK and b) the individual animal cost of measuring emissions in a 24-hour assessment with laser methodology. SRUC will oversee the international work on the development of new milk mid-infrared prediction equations for novel traits and thus add to the value of the data/activity in this and linked projects. SRUC will contribute genotype and sequence data on a subset of cows with milk mid-infrared phenotypes to enhance potential genomic predictions for novel traits.
Collaborator Contribution International demand for dairy products is set to expand in concert with the middle class of emerging economies, the need for high quality milk protein in developing countries and world population expansion. Already, the Canadian dairy industry contributes $16.2 billion to Canada's GDP (2011). The current proposal looks to address increasing demand, and the global competitiveness of Canada's dairy cattle industry both on-farm and in exporting Canadian dairy genetics. Canadian industry stands to gain $100M annually by improving two key traits in cattle: 1) feed conversion (feed efficiency) toward increased milk production, and 2) reduced methane emission. This project offers the means for effective selection of advantageous feed efficiency and reduced methane emission traits for a more secure and sustainable supply of competitive Canadian dairy products. Using genomics-based approaches to define natural variation between animals, cattle will be selected for higher feed efficiency and lower methane emissions. Canadian dairy producers will have access to bulls whose daughters are more efficient at converting feed into milk and have lower greenhouse gas emissions with the same level of production. As feed is currently the largest expense in milk production, improving cow efficiency will substantially benefit industry members financially. More efficient animals also produce less manure waste, further contributing to a decreased environmental footprint for industry. Presently, it is very difficult and expensive to collect the data (phenotypes) required for genetic improvement of feed efficiency and methane emissions. To date, there has been no large-scale direct selection for these traits in breeding dairy cattle. The latest genomic approaches offer an opportunity to address this, but accurate phenotypes are required for genetically representative animals from the Canadian population. 
Therefore, this project focuses on using genomics to collect important data for calculating individuals' genetic merit. Industry breeding strategies can then incorporate these two traits in developing optimal populations, even for young animals without phenotypes. Involvement of international research partners and industry networks ensures standardization of necessary new data, and broad application of project outputs for the benefit of Canada's dairy industry and global food security and sustainability.
Impact There are no direct impacts from this work to date. The project is multidisciplinary involving livestock genetics researchers, bioinformaticians, ruminant nutritionists and socio-economists
Start Year 2015
 
Description Genomes Canada - Efficient Dairy Genome Project 
Organisation University of Guelph
Department Department of Animal Biosciences
Country Canada 
Sector Academic/University 
PI Contribution SRUC has made an in-kind contribution by sharing data on a) daily feed intake and monthly weights for four years in the UK and b) the individual animal cost of measuring emissions in a 24-hour assessment with laser methodology. SRUC will oversee the international work on the development of new milk mid-infrared prediction equations for novel traits and thus add to the value of the data/activity in this and linked projects. SRUC will contribute genotype and sequence data on a subset of cows with milk mid-infrared phenotypes to enhance potential genomic predictions for novel traits.
Collaborator Contribution International demand for dairy products is set to expand in concert with the middle class of emerging economies, the need for high quality milk protein in developing countries and world population expansion. Already, the Canadian dairy industry contributes $16.2 billion to Canada's GDP (2011). The current proposal looks to address increasing demand, and the global competitiveness of Canada's dairy cattle industry both on-farm and in exporting Canadian dairy genetics. Canadian industry stands to gain $100M annually by improving two key traits in cattle: 1) feed conversion (feed efficiency) toward increased milk production, and 2) reduced methane emission. This project offers the means for effective selection of advantageous feed efficiency and reduced methane emission traits for a more secure and sustainable supply of competitive Canadian dairy products. Using genomics-based approaches to define natural variation between animals, cattle will be selected for higher feed efficiency and lower methane emissions. Canadian dairy producers will have access to bulls whose daughters are more efficient at converting feed into milk and have lower greenhouse gas emissions with the same level of production. As feed is currently the largest expense in milk production, improving cow efficiency will substantially benefit industry members financially. More efficient animals also produce less manure waste, further contributing to a decreased environmental footprint for industry. Presently, it is very difficult and expensive to collect the data (phenotypes) required for genetic improvement of feed efficiency and methane emissions. To date, there has been no large-scale direct selection for these traits in breeding dairy cattle. The latest genomic approaches offer an opportunity to address this, but accurate phenotypes are required for genetically representative animals from the Canadian population. 
Therefore, this project focuses on using genomics to collect important data for calculating individuals' genetic merit. Industry breeding strategies can then incorporate these two traits in developing optimal populations, even for young animals without phenotypes. Involvement of international research partners and industry networks ensures standardization of necessary new data, and broad application of project outputs for the benefit of Canada's dairy industry and global food security and sustainability.
Impact There are no direct impacts from this work to date. The project is multidisciplinary involving livestock genetics researchers, bioinformaticians, ruminant nutritionists and socio-economists
Start Year 2015