Improving livestock production through high-throughput identification of functional regulatory variation

Lead Research Organisation: University of Edinburgh

Department Name: The Roslin Institute

Abstract

Over the past several decades there has been substantial global investment in mapping the regions of livestock genomes that control production, disease tolerance and welfare phenotypes. The ultimate aim of mapping these DNA regions is to allow researchers to use advanced genomics and breeding approaches to improve the production and welfare of livestock more rapidly. Although many important loci have been mapped, in most cases we do not know which precise genetic changes in these regions are linked to the observed differences in phenotypes, which makes it more difficult to apply advanced approaches such as gene editing to improve these traits. However, studies in cattle have estimated that variants that alter downstream phenotypes are over 18 times more likely to do so by changing transcription, i.e. the expression level of genes, than is expected by chance (Nat Genet 50, 362-367 (2018)). If we can map which variants directly impact expression levels, we can determine which genetic changes are most likely driving the observed changes in key traits. This will substantially increase the rate at which we can improve important livestock phenotypes.

In this project we will apply a high-throughput approach that directly tests the impact of millions of genetic changes on gene expression at the same time. This will allow us to generate a catalogue of cattle functional variants that are directly linked to changes in transcription, and which may therefore underlie loci linked to important traits. We will also take this further and test the impact of human genetic changes in cattle cells, and vice versa. Some species are much better annotated than others and have richer datasets, and we will use these data to determine which features are linked to genetic variants that impact gene regulation across species. Using these data and machine learning approaches, we will develop statistical models that allow researchers to predict which genetic changes are likely to have an impact across species. This will allow researchers to exploit the data from better characterised species to improve less well annotated ones, further accelerating livestock improvement efforts and potentially also informing, for example, human disease studies that are based on animal models.

Consequently, this project is expected to substantially improve the understanding of both cattle and human phenotypes by mapping regulatory variants and developing statistical models for predicting variants that impact transcription across species.

Technical Summary

Livestock research benefits from the fact that if functional variants can be identified they can be readily acted upon via breeding and genome editing. Modern massively parallel reporter assays (MPRA) have the potential to bridge the current substantial gaps between mapping genomic loci and identifying actionable variants. By cataloguing the impact of genetic variants on transcription levels on a genome-wide scale, it is possible to identify variants with a direct impact on transcription, something generally not possible with traditional eQTL studies. In this project we propose to apply the SuRE MPRA approach to cattle for the first time, to assess the potential impact of millions of variants on transcription and identify thousands of regulatory variants potentially driving downstream phenotypes.
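As a rough illustration of how a reporter assay can score a variant's direct effect on transcription, the sketch below compares the DNA-normalised RNA counts of the reference and alternative alleles and reports a log2 ratio. It is not the SuRE analysis pipeline: the function name, the pseudocount and all counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def allelic_log2_effect(ref_rna, ref_dna, alt_rna, alt_dna, pseudocount=1.0):
    """Return log2(alt activity / ref activity) for one variant.

    Activity is the RNA read count divided by the DNA (input library) count
    for fragments carrying that allele. The pseudocount avoids division by
    zero; its value here is an assumption, not a recommended setting.
    """
    ref_activity = (ref_rna + pseudocount) / (ref_dna + pseudocount)
    alt_activity = (alt_rna + pseudocount) / (alt_dna + pseudocount)
    return math.log2(alt_activity / ref_activity)


# Made-up read counts for two hypothetical variants (illustration only).
variants = {
    "var_A": {"ref_rna": 99, "ref_dna": 49, "alt_rna": 399, "alt_dna": 49},
    "var_B": {"ref_rna": 149, "ref_dna": 74, "alt_rna": 149, "alt_dna": 74},
}

effects = {name: allelic_log2_effect(**counts) for name, counts in variants.items()}

for name, effect in effects.items():
    print(f"{name}: log2 allelic effect = {effect:.2f}")
```

In this toy input the alternative allele of var_A drives four times the reporter activity of the reference allele (a log2 effect of 2), while var_B shows no difference. A real analysis would also need replicates and a statistical test before calling a variant regulatory.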

Recent studies have shown not only that there is considerable overlap in the loci linked to the same traits across species, but also that gene regulation is well conserved across mammals, to the extent that models for predicting distal regulatory elements work well across species. This suggests that findings from one species can potentially be lifted over to another. To investigate this potential, we will test which human genetic variants impact transcription in cattle cells, as well as vice versa, and determine which variants have conserved impacts across species. Using these data alongside relevant annotations, we will develop machine learning models for predicting which genetic changes affect transcription and have conserved impacts on gene expression across mammals. This will enable the statistical prediction of functional variants and the potential lifting over and exploitation of findings across species. We will validate these models and predictions using CRISPR/Cas9 editing, by introducing human variants predicted to have conserved impacts across species into cattle cells and assessing their impacts on transcription.

Funded Value:

£607,164

Funded Period:

Mar 22 - Feb 25

Funder:

BBSRC

Project Status:

Active

Project Category:

Research Grant

Project Reference:

BB/W000288/1

Principal Investigator:

James Prendergast

Research Subject:

Animal science (42%)

Genetics & development (56%)

Research Topic:

Animal diseases (14%)

Animal welfare (14%)

Gene action & regulation (56%)

Livestock production (14%)

Organisations

University of Edinburgh (Lead Research Organisation)

People	ORCID iD
James Prendergast (Principal Investigator)
Liam Morrison (Co-Investigator)	http://orcid.org/0000-0002-8304-9066
Musa Hassan (Co-Investigator)	http://orcid.org/0000-0002-0371-3300
Tim Connelley (Co-Investigator)

Publications

Zhao R (2022) The conservation of human functional variants and their effects across livestock species. in Communications Biology

Key Findings


Description	We have successfully generated the first genome-wide massively parallel reporter assay (MPRA) dataset in cattle. This dataset covers both sub-species (Bos indicus and Bos taurus) and has matching human data, with all three MPRA libraries tested across human and cattle cells. This unique resource has allowed us not only to define cattle regulatory variants at base pair resolution, but also to identify genetic variants whose effects depend on their cellular environment. We are using these data alongside other omics data with bioinformatics/machine learning approaches to gain new insights into the genetics of gene regulation in cattle and how it has evolved across mammals.
Exploitation Route	We have already demonstrated that these data can be used to identify functional variants underlying important phenotypes, and we expect that we and others can use the data to improve the estimation of genomic estimated breeding values (gEBVs) and thereby improve cattle production and health phenotypes.
Sectors	Agriculture, Food and Drink
