Improving livestock production through high-throughput identification of functional regulatory variation

Lead Research Organisation: University of Edinburgh
Department Name: The Roslin Institute

Abstract

Over the past several decades there has been substantial global investment in mapping the regions of livestock genomes that control production, disease tolerance and welfare phenotypes. The ultimate aim of mapping these DNA regions is to allow researchers to use advanced genomics and breeding approaches to improve the production and welfare of livestock more rapidly. Although many important loci have been mapped, in most cases we do not know which precise genetic changes in these regions are linked to the observed differences in phenotypes, which makes it more difficult to apply advanced approaches such as gene editing to improve these traits. However, studies in cattle have estimated that variants that alter downstream phenotypes are over 18 times more likely to do so by changing transcription, i.e. the expression level of genes, than is expected by chance (Nat Genet 50, 362-367 (2018)). If we can map which variants directly impact expression levels, we can determine which genetic changes are most likely driving the observed changes in key traits. This will in turn substantially increase the rate at which important livestock phenotypes can be improved.

In this project we will apply a high-throughput approach that directly tests the impact of millions of genetic changes on gene expression at the same time. This will allow us to generate a catalogue of cattle functional variants that are directly linked to changes in transcription, and which may therefore underlie loci linked to important traits. We will also take this further and test the impact of human genetic changes in cattle cells, and vice versa. Certain species are much better annotated, and have richer datasets, than others, and we will use these data to determine which features are linked to genetic variants that impact gene regulation across species. Using these data and machine learning approaches, we will develop statistical models that allow researchers to predict which genetic changes are likely to have an impact across species. This will allow researchers to exploit the data from better characterised species to improve less well annotated ones, further accelerating livestock improvement efforts and also potentially informing, for example, human disease studies that are based on animal models.

Consequently, this project is expected to substantially improve the understanding of both cattle and human phenotypes by mapping regulatory variants and developing statistical models for predicting variants that impact transcription across species.

Technical Summary

Livestock research benefits from the fact that, once functional variants have been identified, they can be readily acted upon via breeding and genome editing. Modern massively parallel reporter assays (MPRAs) have the potential to bridge the current substantial gap between mapping genomic loci and identifying actionable variants. By cataloguing the impact of genetic variants on transcription levels on a genome-wide scale, it is possible to identify variants with a direct impact on transcription, something that is generally not possible with traditional eQTL studies. In this project we propose to apply the SuRE MPRA approach to cattle for the first time, to assess the potential impact of millions of variants on transcription and to identify thousands of regulatory variants potentially driving downstream phenotypes.

Recent studies have shown not only that there is considerable overlap in the loci linked to the same traits across species, but also that gene regulation is well conserved across mammals, to the extent that models for predicting distal regulatory elements work well across species. This suggests that findings from one species can potentially be lifted over to another. To investigate this potential we will test which human genetic variants impact transcription in cattle cells, and vice versa, and determine which variants have conserved impacts across species. Using these data alongside relevant annotations, we will develop machine learning models for predicting which genetic changes affect transcription and have conserved impacts on gene expression across mammals. This will enable the statistical prediction of functional variants and the potential lifting over and exploitation of findings across species. We will validate these models and predictions using CRISPR/Cas9 editing, by introducing human variants predicted to have conserved impacts across species into cattle cells and assessing their impacts on transcription.
