Development of a high-throughput pipeline to identify causal variants and its demonstration in pig muscle

Lead Research Organisation: University of Edinburgh
Department Name: Roslin Institute

Abstract

This project will increase the effectiveness of commercial livestock breeding programmes by developing a method of identifying causal genomic variants, the individual genome elements that control the traits that breeders need to enhance. Traits like muscling, which is the example we use in the project, are controlled by thousands of causal genomic variants, and breeding selections currently depend on identifying genome regions that contain a preponderance of beneficial causal variants, without identifying individual variants. If breeders had a method of identifying causal genomic variants, their selections would be more accurate and more precise, and in the future they could use genome editing to accelerate improvement while protecting genetic diversity.

Our method will work as a framework of stages to identify causal variants by evaluating information from different sources. The first stage takes historical breeding information and identifies genome regions containing millions of variants, each of which starts with the same probability of being beneficial to the trait and the same, but lower, probability of being deleterious. Each subsequent stage of the framework brings in a new source of information and uses it to adjust the two probabilities for each variant. As the stages proceed, progressively fewer variants remain, each with an increasing probability of being causal and beneficial for the trait. Early stages of the framework use information that is already available or easy to collect, so that the majority of variants can be rejected before they reach stages where the information is expensive to collect.
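
The staged adjustment of the two probabilities can be illustrated with a minimal sketch. It is not the framework's actual statistical model: it assumes each stage supplies, per variant, a likelihood ratio for "beneficial" and for "deleterious" relative to "neutral", applies a simple Bayes update, and drops variants whose beneficial probability falls below a stage-specific threshold. The names `Variant`, `update` and `run_stage`, the likelihood-ratio form of the evidence, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    p_beneficial: float   # current probability the variant is causal and beneficial
    p_deleterious: float  # current probability the variant is causal and deleterious

    @property
    def p_neutral(self) -> float:
        # Remaining probability mass: the variant is not causal for the trait.
        return 1.0 - self.p_beneficial - self.p_deleterious


def update(variant: Variant, lr_beneficial: float, lr_deleterious: float) -> Variant:
    """Bayes update; likelihood ratios are relative to the neutral hypothesis."""
    b = variant.p_beneficial * lr_beneficial
    d = variant.p_deleterious * lr_deleterious
    n = variant.p_neutral  # likelihood ratio of 1 by definition
    total = b + d + n
    return Variant(variant.name, b / total, d / total)


def run_stage(variants, evidence, keep_threshold):
    """Apply one stage of evidence, then keep variants at or above the threshold.

    `evidence` maps variant name -> (lr_beneficial, lr_deleterious); variants
    with no evidence at this stage are left unchanged (ratios of 1).
    """
    updated = [update(v, *evidence.get(v.name, (1.0, 1.0))) for v in variants]
    return [v for v in updated if v.p_beneficial >= keep_threshold]


if __name__ == "__main__":
    # Toy priors: equal across variants, deleterious lower than beneficial.
    pool = [Variant(f"v{i}", 0.02, 0.01) for i in (1, 2, 3)]
    # Hypothetical stage 1 (cheap evidence) and stage 2 (more expensive evidence).
    pool = run_stage(pool, {"v1": (10.0, 1.0), "v2": (0.1, 1.0)}, keep_threshold=0.01)
    pool = run_stage(pool, {"v1": (5.0, 1.0), "v3": (0.2, 1.0)}, keep_threshold=0.05)
    for v in pool:
        print(v.name, round(v.p_beneficial, 3), round(v.p_deleterious, 3))
```

In this sketch the cheap first stage removes a variant before the second stage is run, which mirrors the intent that most variants are rejected before expensive information has to be collected for them.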

In the project we propose to develop the framework and to integrate and test four stages, including gene-editing of muscle cells in culture. In the future, the framework can be expanded to include new sources of information as they become available.

To be successful the project needs to solve three problems:
1. We need a computational framework to integrate information from different sources and identify putative causal variants.
2. We need to test putative causal variants by gene-editing muscle cells in culture.
3. We need to evaluate the framework in a real breeding program.


The project will develop an "Allele Testing" framework for breeding programmes by integrating:
- Sequence data and phenotypes on 375,000 pigs from a recently concluded project of ours;

- Functional genomic and expression data that are publicly available, that we have generated in a Roslin-funded Pump Priming Project, or that we will collect in this proposed project;

- Data from gene-editing of cultured muscle cells to be collected in the proposed project.


The project has three objectives, as follows:
1. We will develop a genomics pipeline that integrates genome-wide association study (GWAS) results, expression quantitative trait loci (eQTL) and functional annotation into a ranked list of putative causal variants, using a suite of statistical and bioinformatic methods.

2. We will use gene editing to introduce putative causal genomic variants into a pig in vitro cell system for detection of a cell phenotype.

3. We will validate the "Allele Testing" framework by predicting genomic breeding values for a set of validation pigs, with and without the information on the putative causal genomic variants that the framework discovers, and then comparing the accuracy of the two sets of genomic breeding values by correlating each with progeny test records for the validation pigs.
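
The comparison in objective 3 can be sketched as follows. This is an illustration under stated assumptions rather than the project's validation procedure: accuracy is taken to be the Pearson correlation between predicted genomic breeding values and progeny test records for the same validation pigs, and the function names `pearson` and `compare_accuracy` are hypothetical. The numbers in the usage example are toy values chosen only to exercise the function and are not results.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def compare_accuracy(gebv_baseline, gebv_with_variants, progeny_records):
    """Correlate each set of breeding values with the same progeny test records."""
    baseline = pearson(gebv_baseline, progeny_records)
    with_variants = pearson(gebv_with_variants, progeny_records)
    return {
        "baseline": baseline,
        "with_variants": with_variants,
        "gain": with_variants - baseline,
    }


if __name__ == "__main__":
    # Toy values for four hypothetical validation pigs (not real data).
    progeny = [1.0, 2.0, 3.0, 4.0]
    without_variants = [1.0, 3.0, 2.0, 4.0]
    with_variants = [1.0, 2.0, 3.0, 4.0]
    print(compare_accuracy(without_variants, with_variants, progeny))
```

Both sets of breeding values are correlated against the same progeny test records, so any difference in accuracy reflects the added variant information and not a change in the validation animals.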

Technical Summary

This project will increase the effectiveness of commercial livestock breeding programmes by developing an "Allele Testing" framework to identify causal genomic variants that control economically important traits, using pig muscling as an exemplar trait. The framework will act as a multi-stage filter to identify causal variants by integrating information from different sources. As the stages proceed, progressively fewer variants remain, each with an increasing probability of being causal and beneficial for the trait. In the project we propose to develop the framework and to integrate and test four stages, including genome-editing in vitro. In the future, the framework can be expanded to include new sources of information as they become available.

The project will develop an "Allele Testing" framework for breeding programmes by integrating:

- Sequence data and phenotypes on 375,000 pigs from a recently concluded project of ours;

- Functional genomic and expression data that are publicly available, that we have generated in a Roslin-funded Pump Priming Project, or that we will collect in this proposed project;

- Data from gene-editing of cultured muscle cells to be collected in the proposed project.

The project has three objectives, as follows:

1. Develop a genomics pipeline that integrates genome-wide association study (GWAS) results, expression quantitative trait loci (eQTL) and functional annotation into a ranked list of putative causal variants, using a suite of statistical and bioinformatic methods.

2. Use gene editing to introduce putative causal genomic variants into a pig in vitro cell system for detection of a cell phenotype.

3. Validate the "Allele Testing" framework by predicting genomic breeding values for a set of validation pigs, with and without the information on the putative causal genomic variants that the framework discovers, and then comparing the accuracy of the two sets of genomic breeding values by correlating each with progeny test records for the validation pigs.

Planned Impact

The primary goal of this project is to increase the effectiveness of livestock breeding programmes. Our direct links with the pig breeding industry mean that the outcomes of this project, if successful, will be immediately translated into practice, with a positive economic impact on 25% of the "technified" global pork industry.

There will also be a downstream beneficial impact for the scientific community via the tools and knowledge developed within the project, particularly contributions to the fields of animal breeding, plant genetics, medical science, quantitative genetics, computational genomics, gene editing and cell biology. All of these fields would benefit from tools to discover causal variants for quantitative traits. Finally, the general public and policy makers will benefit from improved efficiency and sustainability of commercial pig production, and the project is likely to have wider ranging impacts on the production of other livestock species.


Animal breeding companies, breed societies, and levy boards: Tools that increase the efficiency of livestock breeding programmes will make it possible to breed better production animals that are healthier and have better welfare. The software and scripts that we will use to generate and analyse data, the causal genomic variants that we identify in this project, and the editing tools and cell systems will all be made available to these organisations.


Users of animal products: The entire chain of users of pig products, including meat packers, processors, retailers and consumers, will benefit from Genus and other breeding companies being equipped with tools to increase the effectiveness of commercial livestock breeding programmes. These tools will allow breeding companies to deliver a lower-cost, higher-quality product that is more environmentally friendly, healthier and suited to the individual requirements of stakeholders in the supply chain.


UK Treasury: The Treasury will benefit from increased tax revenues through increased profitability of Genus and other UK adopters, the pork supply chain, other UK agricultural users should they adopt downstream products of breeding, and UK-based providers of sequencing, genome editing, and cell biology technology.


UK science infrastructure and capacity: The proposed methods and data set will provide a platform for increased R&D capabilities in the UK, maintaining its scientific reputation and associated institutions, with increased capability for enhancing sustainable agricultural production. The proposed research will be embedded within external training courses that the PI is regularly invited to give, and the post-doc working on the project will have the opportunity to be trained at a world-class institute in a cutting-edge area of research while interacting with a leading commercial partner.


Plant genetics, medical genetics and other fields of genetics: Plant genetics (for breeding), medical genetics (to aid drug discovery and personalized medicine) and other fields of genetics such as evolutionary biology (to understand the evolution of natural populations) all would benefit from knowledge of causal variants, the finding of which has been somewhat intractable historically. The Allele Testing framework, if successful, could help each of these fields.


Society and education: All members of society who work to improve or depend upon the competitiveness and sustainability of agriculture will benefit from the downstream practical applications outlined above. The application of the outcomes by breeding organisations will lead to faster and more sustainable genetic progress, leading to healthier food and to food production that is more resource efficient and affordable. Increased efficiencies in agriculture have direct societal benefits in greater food security with less environmental impact. The knowledge will feed into local undergraduate and graduate programmes and public engagement programmes at the Easter Bush Outreach Centre.
