GCRF-BBR: A compendium of structural variation across African cattle breeds

Lead Research Organisation: University of Edinburgh
Department Name: The Roslin Institute


Cattle are a vital component of the economies of low and middle income countries across the globe. They are a source of meat and dairy products, provide leather and other by-products and are often used as working animals. Cattle can be grazed where crops are not easily grown, provide manure as effective fertiliser and can convert forage into high-protein food. Cattle are often the most valuable possession that individuals in low and middle income countries own, but they are susceptible to a range of infectious diseases, which not only inflict a heavy economic burden on these countries but are also often transmissible to humans.

Over two-thirds of the global cattle population are in low and middle income countries. Reducing the burden of cattle infectious diseases would consequently have major benefits to these areas. However for many diseases no vaccine or treatment is available. Where they are available, poor veterinary services generally mean the livestock keepers must pay for vaccines and treatments themselves. Consequently these diseases are major barriers to escaping poverty.

Due to the co-evolution of pathogens and cattle there are various examples of cattle breeds that show natural tolerance or resistance to important infectious diseases. Although these breeds are often less productive, making them less attractive to farmers in low and middle income countries (LMICs), if the mechanisms underlying their resistance could be identified and harnessed, then breeds of both higher productivity and tolerance could be developed. Likewise, certain breeds have been shown to be better adapted to particular environmental conditions, for example being better able to tolerate extreme temperatures and limited access to feed and water.

Structural variants (SVs), large alterations in an animal's genome sequence, have been linked in livestock to adaptation to environments and natural tolerance to important diseases. Despite their potential value, little is currently known about the genome-wide location of SVs in cattle, and almost nothing is known for African cattle breeds. Native African cattle are particularly well adapted to African environments and pathogens, with structural variants expected to underlie many such important traits. Knowledge of the location of structural variants in African cattle breeds will enable researchers to determine their functional consequences and develop better breeding programs. In this project we will develop a database of structural variants focused on African cattle breeds to enable studies of their role in shaping important phenotypes.

Technical Summary

There is a range of cattle breeds in Africa that have co-existed with the local environment and pathogens for thousands of years, leading to adaptation and tolerance to local diseases, environments and resources. Notably, some cattle breeds tolerate infection with pathogens that cause significant disease in others, while other breeds are drought or temperature tolerant. The exploitation of this valuable resource of natural tolerance has been lacking, but efforts are beginning to characterise the genetic diversity of cattle across these regions. To date the focus has been on using SNPs and short indels to map the locations of genomic loci linked to economically important phenotypes. This is in part due to the difficulty of determining the location of larger alterations from sequencing technologies, in particular in species such as the cow where the reference genome remains incomplete. However, the unique evolutionary history of the cow and current knowledge of the role of SVs in livestock phenotypes suggest that larger genomic changes potentially underlie many of these important adaptation events. In this pilot study we propose to address this lack of knowledge on African cattle SVs and start compiling and making available a database of SVs across African breeds. This will enable subsequent studies of the roles of these SVs in important phenotypes, so that they can then be exploited in downstream breeding programs. Due to the relatively poor cow reference genome and the limitations of technologies such as array CGH and high-throughput sequencing, we propose to use optical mapping to characterise the spectrum of SVs across important African breeds. We will then set up a browser interface for users to query and view SVs across breeds at different genomic locations to facilitate the mapping of SVs to phenotypes. The outputs of this project will significantly increase our knowledge base on LMIC cattle diversity and accelerate the potential of exploiting this valuable resource.
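To illustrate the kind of query such a browser interface might support, the following is a minimal sketch in Python of looking up SVs that overlap a genomic region, optionally restricted to one breed. The record layout, the function name `query_region` and all coordinates are hypothetical and chosen only for illustration; they do not describe the project's actual database or browser implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class StructuralVariant:
    """Hypothetical SV record; field names are assumptions, not the project's schema."""
    breed: str
    chrom: str
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive
    sv_type: str  # e.g. "DEL", "INS", "INV"


def query_region(
    svs: Iterable[StructuralVariant],
    chrom: str,
    start: int,
    end: int,
    breed: Optional[str] = None,
) -> List[StructuralVariant]:
    """Return SVs overlapping chrom:start-end, optionally filtered by breed."""
    hits = [
        sv
        for sv in svs
        if sv.chrom == chrom
        and sv.start <= end   # SV begins before the region ends
        and sv.end >= start   # SV ends after the region begins
        and (breed is None or sv.breed == breed)
    ]
    # Sort by position so results read left to right along the chromosome.
    return sorted(hits, key=lambda sv: (sv.start, sv.end, sv.breed))


# Made-up example records; the coordinates are not real calls.
EXAMPLE_SVS = [
    StructuralVariant("Boran", "chr1", 500, 1200, "DEL"),
    StructuralVariant("Ankole", "chr1", 1500, 1600, "INS"),
    StructuralVariant("Nelore", "chr1", 2500, 3000, "INV"),
    StructuralVariant("Boran", "chr2", 1000, 2000, "DEL"),
]

if __name__ == "__main__":
    for sv in query_region(EXAMPLE_SVS, "chr1", 1000, 2000):
        print(sv.breed, sv.chrom, sv.start, sv.end, sv.sv_type)
```

A linear scan is used here for clarity; a resource covering many breeds would more plausibly rely on an indexed format or an interval tree, which is a design choice outside the scope of this sketch.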

Planned Impact

This pilot work and portal will enable downstream studies of the role of structural variants in shaping disease tolerance and environmental adaptation in African cattle. We therefore envisage a range of both short term and long term beneficiaries of this program of research.

- Enabling and interpreting genetic association studies

Primary beneficiaries in the short term will be academics investigating the resistance/tolerance of cattle to infectious diseases and their adaptation to the African environment. A public database of structural variation will both allow SVs to be tested against key phenotypes and inform the interpretation of genetic association and population genetic study results by providing candidate functional variants in relevant regions.

- Reduced burden of infectious diseases

The primary long term target beneficiaries of this work are livestock holders in low and middle income countries, through enabling the identification and exploitation of genomic regions linked to disease tolerance and environmental adaptation. For example, the annual cost of treating cattle with an acaricide (to reduce tick-borne diseases) has been estimated at $6-$36 per animal, with treatment of an infected animal costing approximately $38. The exploitation of functional SVs and development of naturally resistant and productive cattle breeds has the promise of dramatic economic benefits to LMIC farmers and in particular the rural poor.

- Environmental benefits

Current use of insecticides and acaricides has significant impacts on the environment and soil fertility. Reducing their use through understanding alternative mechanisms of reducing disease burden could therefore have substantial longer term environmental benefits. Furthermore, exploiting SVs linked to drought tolerance and environmental adaptation could enable the development of less resource intensive, but productive, breeds.

- Informing array development and improved genotyping

We expect this work to feed into the ongoing development of African-targeted bovine genotyping arrays/assays that would dramatically improve the ability to undertake GWAS for cattle traits on the continent. Importantly the identification of sites of SVs in LMIC cattle breeds is also expected to substantially improve genotype calling from sequencing data. Unknown structural variation is one of the largest sources of false positive variant calls.

- Training of LMIC researchers

In conjunction with BecA (Biosciences Eastern and Central Africa) we will use this resource in our ongoing training courses for African students and scientists. This will provide the resource for training programs targeted towards understanding the genetic adaptation of African cattle to pathogens and environment and the mechanisms underlying the poor performance of divergent European breeds in an African setting.

- Annotation of Indicine and African Taurine genomes

The current Hereford reference genome is limited not only by its incomplete status but also by the unclear extent to which it represents African cattle breeds. These SVs will be used to inform accurate annotation of a new African taurine reference genome being generated as part of a separate GCRF funded project. This will provide an appropriately tailored and accurate resource for researchers working on these cattle sub-species and will maximise the impact and use of this genomic information. Importantly, this resource is also likely to identify issues with the current reference assembly, given its incomplete nature and the fact that it is likely to contain erroneous rearrangements. Going forward we also expect this dataset to contribute to the generation of a genome graph representation of the cow genome.
Description We have successfully generated Bionano data across the nine cattle breeds, allowing us to identify the extent of structural variation in the cattle genome. We have identified large numbers of structural variants between these breeds that are not captured by the current reference sequence. Using these data we have generated a new cattle graph genome that represents the diversity of global cattle better than the currently used genome from a single Hereford cow.
Exploitation Route Once finished, we will upload the data and results from this grant to our BOmA browser (https://www.bomabrowser.com/waterbuffalo/), meaning all data will be freely accessible and browsable by others. Likewise, the graph genome within which it is integrated will be made freely accessible from the same site.
Sectors Agriculture, Food and Drink

Description Beyond a single reference: Building high quality graph genomes capturing global diversity
Amount £436,526 (GBP)
Funding ID BB/T019468/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 11/2020 
End 01/2023
Title Bionano optical mapping data across diverse cattle breeds 
Description We have generated long read optical mapping data for 18 animals across nine diverse global cattle breeds. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
Impact The Bionano optical mapping data has already been used to scaffold a new high quality Ankole genome and will be used to generate high quality assemblies of further cattle breeds. The data has also been used to investigate structural variation (SV) across global breeds, highlighting the extensive level of SV in Bos indicus cattle lineages.
URL https://zenodo.org/record/6516993#.Y_9kk3bP1PY
Title Boran genome assembly 
Description We have generated a Boran HiFi assembly scaffolded with Bionano optical mapping data 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? No  
Impact We are using this to investigate the genetic basis of heritable tolerance to East Coast fever observed among this animal's pedigree. 
Title Cattle genome assemblies for Ankole and NDama breeds 
Description High quality reference genome assemblies for two cattle breeds generated from PacBio and Illumina sequencing data and Bionano optical mapping data.
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact The data has been submitted for publication 
URL https://www.bomabrowser.com/cattle.html
Title XP-EHH and XP-CLR scores across water buffalo and cattle breeds 
Description This dataset consists of XP-EHH and XP-CLR scores calculated using the hapbin (https://github.com/evotools/hapbin) and xpclr (https://github.com/hardingnj/xpclr) software with default parameters across 79 water buffalo and 294 cattle whole genome sequences. These data are also available in our BOmA browser (https://www.bomabrowser.com/index.html) and were used in this publication: Dutta, P., Talenti, A., Young, R. et al. Whole genome analysis of water buffalo and global cattle breeds highlights convergent signatures of domestication. Nat Commun 11, 4739 (2020). https://doi.org/10.1038/s41467-020-18550-1
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact This dataset was used in the following publication: Dutta, P., Talenti, A., Young, R. et al. Whole genome analysis of water buffalo and global cattle breeds highlights convergent signatures of domestication. Nat Commun 11, 4739 (2020). https://doi.org/10.1038/s41467-020-18550-1
URL https://datashare.ed.ac.uk/handle/10283/3665
Description Collaboration with partners in Brazil 
Organisation Universidade de São Paulo
Country Brazil 
Sector Academic/University 
PI Contribution A team member travelled to Brazil to collect and prepare samples.
Collaborator Contribution Our partners in Brazil arranged all the transport and access to animals as well as lab space and use of technical equipment such as FACS machines.
Impact All data generated for the Nelore breed are a result of this collaboration
Start Year 2018
Description Collaboration with partners in Kenya 
Organisation International Livestock Research Institute (ILRI)
Country Kenya 
Sector Charity/Non Profit 
PI Contribution A member of our team travelled to Nairobi to collect and prepare samples
Collaborator Contribution Our partners arranged transport, veterinary assistance, sample collection, lab space and access to required reagents and equipment such as FACS machines.
Impact The access to African cattle samples has largely been achieved through this partnership
Start Year 2017
Title BOmA (Bovine Omic Atlas) 
Description BOmA is a genome browser tailored for viewing cattle omic data, including that being generated alongside or as part of this award. Data currently on the browser spans both water buffalo and cattle and includes, for example, genotypes from 420 global cattle breeds and optical mapping, ATAC-seq, RNA-seq and RRBS data for various breeds. The first version of the browser is available at https://www.bomabrowser.com/ and we are currently in the process of updating it to support visualising graph genomes.
Type Of Technology Webtool/Application 
Year Produced 2019 
Impact The browser has already been used to prioritise candidate functional sites, for example, in regions putatively linked to trypanosome and T. parva tolerance.
URL https://www.bomabrowser.com/
Title evotools/CattleGraphGenomePaper: Code for Talenti et al. A cattle graph genome incorporating global breed diversity. 
Description This release contains the code used for the analyses in Talenti et al. A cattle graph genome incorporating global breed diversity.
Type Of Technology Software 
Year Produced 2021 
URL https://zenodo.org/record/5749431
Description Water buffalo research covered by several news outlets 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Press releases regarding our water buffalo research (https://www.nature.com/articles/s41467-020-18550-1) were picked up by a number of news sources including, for example, The Hindu newspaper, with a daily circulation of over 1.4 million.
Year(s) Of Engagement Activity 2020
URL https://nature.altmetric.com/details/90865165/news