US-UK BBSRC-NIFA Collab-Reassembly of cattle immune gene clusters for quantitative analysis

Lead Research Organisation: The Pirbright Institute
Department Name: Immunogenetics

Abstract

Since livestock were first domesticated approximately 10,000 years ago they have been selectively bred for desirable traits. Traditional genetic improvement using measurable traits and animal pedigrees has been very successful, particularly to increase production in important livestock species. The result today is a plethora of different breeds that are particularly suited for different environments or types of production, e.g. dairy and beef cattle.

However, within most livestock populations there is a considerable amount of variation that has never been exploited during selective breeding. As the global demand for food increases rapidly, the demand for livestock improvement is escalating. As a consequence of this demand and recent advances in technology, it is now possible to inform breeding strategies based on the animal's genome sequence. In cattle this has been made possible through the characterisation of nearly 800,000 single nucleotide polymorphisms (SNPs) identified in cattle genomes. Rather than having to sequence the whole genome of each animal, rapid identification of SNPs that are associated with parentage, productive traits or breed composition allows for breeding decisions to be made earlier in an animal's life. Unknown animals with no phenotypic data can then be assessed solely on their SNP genotype and their breeding values calculated. This method of genomic selection is now widely used by cattle breeding companies.

However, as with any young technology, problems remain. If regions of the genome are very variable between individuals and/or very repetitive, it is difficult to identify SNPs that can be screened by the genotyping technology. There are several highly variable and repetitive immune gene complexes in mammalian genomes which have a fundamental role in disease resistance and responses to vaccines. Moreover, these regions have evolved this complexity, at least in part, to combat rapidly evolving pathogens. In cattle, we have identified that the current SNPs do not cover three large and vital immune gene complexes, and to a large extent these complexes have not been assembled in the current genome builds. The validation of SNPs for use in genotyping relies upon an accurate genome assembly across the region in which the SNP is located; therefore this further compounds the problem. Ultimately the current technology is not yet able to type for genetic markers associated with important immune genes that are likely to influence health and disease resistance traits.

Cattle possess a pool of natural genetic diversity, evolved to counter rapidly evolving pathogens, that cannot yet be selected for using genomics. We propose to develop the tools to utilise this diversity to improve health and disease resistance traits in cattle. Building on our initial assemblies of these gene complexes, we will assemble these genomic regions in many individuals to characterise the extent of large structural variation. Existing short whole genome sequence reads from > 30 individuals will then be aligned to these larger regions, alongside other publicly available sequence datasets. By targeting these regions, it will be possible to identify and validate suitable SNPs, even those at low frequency, which will then be incorporated into a genotyping platform. The utility of this tool will then be tested by genotyping a herd of cattle that display differential disease resistance to bovine tuberculosis, a complex disease that is known to involve a genetic component and is influenced by the gene complexes we are targeting in this study. Ultimately we envisage that these markers can then be incorporated into current and future genotyping technologies to improve disease resistance in cattle through selective breeding.

Technical Summary

This project will target and sequence three highly variable immune gene clusters in cattle: the major histocompatibility complex (MHC), the leukocyte receptor complex (LRC) and the natural killer complex (NKC). The highly polymorphic molecules encoded in these regions control and influence a diverse array of fundamental immune functions. However, these regions remain largely absent or misassembled in the current builds of the cattle genome, and very little information regarding polymorphism has been gathered. This in turn has limited the SNP density over these regions on the current single nucleotide polymorphism (SNP) typing platforms that are used to genotype cattle to improve production and health traits and to perform genome-wide association studies. Consequently, it is not possible to determine if these fundamental immune genes are associated with any health or disease traits in cattle, including responses to vaccines.

Using PacBio (SMRT) sequencing of BAC and fosmid clones we aim to de novo assemble these regions in five dairy cattle. Existing whole genome data from 40 Holstein and 20 Angus bulls can then be mapped with confidence to these regions, revealing the position and pattern of polymorphisms. This will enable the development of a bespoke SNP typing platform which for the first time will allow the genotypes over these regions to be determined and examined alongside the SNP genotypes from the rest of the genome. Finally, as a proof of principle experiment, we will test this SNP platform on approximately 1200 cattle that are differentially susceptible to bovine tuberculosis. This experimental cohort was selected due to a priori knowledge of the involvement of genes within the immune complexes being studied with this complex disease. Additionally, this cohort has already been genotyped using the most advanced current SNP typing platform, allowing us to combine these datasets.

Planned Impact

Single nucleotide polymorphism genotyping is both the most contemporary and the most cost effective method used to improve desirable traits through genomic selection. However, important regions of the genome that encode fundamental genes of the immune system are not currently interrogated by these platforms. By targeting these gene complexes this research will enhance the ability of SNP genotyping to examine immune and health traits. We will also test the utility of these new genetic markers by studying their involvement in resistance to bovine tuberculosis. This disease is a massive economic burden to the UK, and the immune response to Mycobacterium bovis is known to involve genes encoded in the complexes targeted in this proposal. Therefore, as a proof of principle experiment, resistance to bovine tuberculosis offers an excellent opportunity to highlight the role of these immune complexes, add to our epidemiological understanding of this disease and maximise the impact of this research.
The impact of the immediate outputs of this research is broad. The initial genomic assemblies will include novel regions that remain unassembled in the current cattle genome builds. As we are targeting one of the two related animals that are the subjects of the cattle genome assembly, providing and annotating regions that are not yet assembled, and correcting regions that contain misassemblies, will benefit the entire global community using this pivotal reference genome. Moreover, characterising the variation within these regions and including this alongside the reference genome will ensure that this important diversity can be exploited by all potential parties. This will be of significant value for comparative genomics researchers as well as the cattle breeding and animal research community. The improved genome structure will allow for better cross-species comparisons and tracking of evolutionary changes in ruminants. It will allow for the tracking of SNPs associated with health and fitness traits in the animal breeding communities. Agri-genomic SNP genotyping companies will benefit from the identification of new useful markers that will be deployed to improve health and fitness traits in cattle.
Therefore the potential impact is very high for the cattle industry. In addition, any identification of bTB associated markers will allow for the creation of low-cost genotyping chips that could identify highly susceptible animals in a herd. The extension to other diseases would be straightforward. The financial savings of such a strategy are enormous.

This research will produce a highly skilled cross disciplinary researcher. Producing such skilled researchers with expertise in livestock immunology and bioinformatics is of significant benefit to the UK academic and non-academic communities. The significant level of collaboration with researchers in the US will make this an outstanding opportunity for a PDRA.
Ultimately, this research will provide information for farmers and breeders to help make decisions on herd management and breeding. This could have enormous benefit by reducing the cost of disease management and increasing sustainability. Any advancement in understanding disease resistance will be of benefit by improving productivity and hence wealth creation. As part of improving food security this research will have a beneficial impact on UK society in general and ultimately the rest of the world. Any effect on reducing the burden of disease will have a major beneficial effect on social welfare, wealth creation through development of livestock industries and the removal of barriers to trade. As such, this project directly addresses BBSRC strategic priority areas in Food Security and therefore contributes to meeting its targets. This project also facilitates data sharing within the animal genetics community, and several other bioscience areas, to facilitate global research within the food security agenda.
 
Description We have developed protocols to enrich and assemble large regions of the genome containing immune genes using contemporary methodologies and bioinformatics. Using probes designed in house, we were able to enrich over 2.5 Mb of the cattle genome from 25 animals. These regions are highly variable between individuals and contain many genes that are fundamental to the immune response.
These data were de novo assembled and each assembly was used as a reference sequence for SNP discovery within a cohort of 175 dairy bulls that were sequenced at very high resolution by our USDA partners. This led to the identification of over 7000 new SNPs that have the potential to be markers for differential immune function in cattle. We took a small subset of these markers, chosen on location and frequency, and produced a physical SNP platform to perform a genome wide association study with a cohort of 4000 cattle that were differentially susceptible to bovine TB and had already been genotyped using the high-density Illumina SNP chip.
This limited set of new SNPs was very successful, with over 75% segregating within the population and providing meaningful data. We are now applying a machine learning approach to these SNPs to enable better prediction of the next round of SNPs from the panel of 7000. We will then perform another GWAS. We are also imputing these SNPs onto over 1 million genotyped cattle held by the USDA that have many phenotypes.
We plan to refine our SNPs over the coming year and apply for future funding.
We have met our original objectives but due to delays with the USDA partners we are yet to finish. However, the award is still active in the US.
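As an illustration of what "segregating within the population" means for a SNP panel, the following is a minimal sketch and not the analysis used in this project. It assumes genotypes coded as alternate-allele counts (0, 1, 2, or None when missing); the function names, the `min_maf` threshold and the toy data are all hypothetical.

```python
from typing import Dict, List, Optional

# Assumed coding: number of alternate alleles (0, 1 or 2), None if the call is missing.
Genotype = Optional[int]


def minor_allele_frequency(genotypes: List[Genotype]) -> Optional[float]:
    """Return the minor allele frequency, or None if no animal was called."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid animal
    return min(alt_freq, 1.0 - alt_freq)


def segregating_snps(panel: Dict[str, List[Genotype]], min_maf: float = 0.0) -> List[str]:
    """Return SNP ids whose minor allele frequency exceeds min_maf."""
    keep = []
    for snp_id, genotypes in panel.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf > min_maf:
            keep.append(snp_id)
    return keep


# Hypothetical toy panel: four SNPs typed in five animals.
toy_panel = {
    "snp_a": [0, 1, 2, 1, None],            # both alleles present
    "snp_b": [0, 0, 0, 0, 0],               # monomorphic, not segregating
    "snp_c": [2, 2, 1, 2, 2],               # reference allele is rare
    "snp_d": [None, None, None, None, None],  # assay failed in every animal
}
kept = segregating_snps(toy_panel)
fraction_segregating = len(kept) / len(toy_panel)
```

In practice a non-zero `min_maf` would usually be chosen so that very rare variants and likely genotyping errors are not counted as informative markers.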
Exploitation Route We hope this will lead to more accurate ways to measure immune responses in livestock, and inform the development of physical tools to examine the genetic variation in cattle populations that underpins differential outcomes of infection with pathogens. We are already talking to several international collaborators about using these SNPs to study historical GWAS datasets.
Sectors Agriculture, Food and Drink

 
Description A new protocol developed by Roche has been used for the workflow of targeted enrichment with subsequent long-read sequencing (PacBio). This protocol initially performed poorly, or not at all, on cattle DNA. We subsequently worked with the Roche Nimblegen developer team to adapt the protocol to cattle DNA, which required substantial changes. A methods paper is planned to make this adapted workflow available to the wider research community. Furthermore, the data from de novo assembled contigs and haplotypes helped to select putative SNPs for immune gene clusters, which are currently being tested and validated for future use in the breeding industry. This technical development has led to industry funding to pursue more efficient methods based on this first phase of work. PacBio approached us and developed a collaboration with IDT to use CRISPR as a probe to enrich specific DNA fragments. Both companies contributed money in kind to translate our protocol using these new advances, which was successful but did not provide the throughput advances required to be commercially viable.
First Year Of Impact 2018
Sector Agriculture, Food and Drink
 
Description BBSRC GCRF Databases and Resources
Amount £1,400,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 10/2016 
End 06/2018
 
Description BBSRC GCRF Databases and Resources
Amount £730,000 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2017 
End 08/2019
 
Description Pirbright Agri-Food Technology Seed Fund
Amount £23,568 (GBP)
Organisation The Pirbright Institute 
Sector Academic/University
Country United Kingdom
Start 01/2018 
End 01/2019
 
Title Cattle MHC genotyping 
Description Using the sequence data generated through the targeted pull down of MHC, we developed a full gene and more targeted PCR approach to genotype cattle for the MHC class I region. This has been applied to many hundreds of samples to enable us to select individuals for breeding as well as survey genetic diversity in beef and dairy herds.
Type Of Material Technology assay or reagent 
Year Produced 2018 
Provided To Others? No  
Impact After publication, which we anticipate in 2019, we will apply this method to targeted herds. We are already attracting industry interest.
 
Title Genome enrichment with long read sequencing
Description We have developed the Nimblegen-Roche SeqCapEZ probe-based pull-down method to work with genomic DNA fragments up to 6kb in length with PacBio sequencing. This is being used for de novo assembly of variable regions.
Type Of Material Technology assay or reagent 
Year Produced 2018 
Provided To Others? No  
Impact We are now involved with industry to refine this protocol further with less input DNA and high multiplex capability. 
 
Title UniMMap- a pipeline for mapping RNAseq data over repetitive immune complexes. 
Description To exploit the abundance of available short read sequencing data we have developed a pipeline that uses mappability to accurately measure transcription over repetitive gene complexes. This method uses known haplotypes to examine regions of uniqueness, and then the RNAseq data from the individual to train the method to be species or individual specific. This is particularly important over gene complexes that contain genes involved in the immune system, which are often highly similar in sequence but can have profoundly different functions.
Type Of Material Technology assay or reagent 
Year Produced 2018 
Provided To Others? No  
Impact The immediate impact will be adding fine resolution data to livestock gene expression atlas projects led by the FAANG consortium. Further impact will arise as this method is published and we apply it to numerous existing datasets that are publicly available as well as those generated at Pirbright and by our collaborators.
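The idea of using known haplotypes to find regions of uniqueness can be illustrated with a small k-mer sketch. This is not the UniMMap implementation, whose details are not given here; it only shows one simple way to flag positions in a gene whose k-mer occurs in no other gene of a set. The function names, the choice of k and the toy sequences are hypothetical, and the RNAseq training step is not modelled.

```python
from collections import defaultdict
from typing import Dict, List


def unique_kmer_mask(genes: Dict[str, str], k: int) -> Dict[str, List[bool]]:
    """For each gene, mark each k-mer start position as True when that k-mer
    is found in no other gene of the set (repeats within one gene still count
    as unique to that gene)."""
    owners = defaultdict(set)  # k-mer -> names of genes that contain it
    for name, seq in genes.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)
    masks = {}
    for name, seq in genes.items():
        masks[name] = [
            len(owners[seq[i:i + k]]) == 1 for i in range(len(seq) - k + 1)
        ]
    return masks


def mappable_fraction(mask: List[bool]) -> float:
    """Fraction of k-mer positions that are unique to the gene."""
    return sum(mask) / len(mask) if mask else 0.0


# Hypothetical toy sequences: two similar genes that differ only at the 3' end.
toy_genes = {
    "geneA": "ACGTACGTTT",
    "geneB": "ACGTACGAAA",
}
masks = unique_kmer_mask(toy_genes, k=4)
fractions = {name: mappable_fraction(mask) for name, mask in masks.items()}
```

Reads that overlap only the positions marked True could be assigned to a single gene with confidence, whereas reads falling wholly in shared positions are ambiguous between the two genes.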
 
Description Bridget Penman 
Organisation University of Warwick
Department School of Life Sciences
Country United Kingdom 
Sector Academic/University 
PI Contribution We are providing genetic data and diversity measures of MHC and associated receptors to inform the modelling efforts to elucidate which selection pressures are the main drivers for the types of diversity we see in extant species.
Collaborator Contribution They are experts in mathematical modelling and will provide data to help explain genetic diversity in cattle.
Impact Not yet
Start Year 2017
 
Description DH - Investigation of immune gene diversity in African Cattle breeds. 
Organisation International Livestock Research Institute (ILRI)
Country Kenya 
Sector Charity/Non Profit 
PI Contribution Providing access to the data on diversity.
Collaborator Contribution Sharing and sourcing of gDNA samples of African cattle breeds to investigate immune gene diversity.
Impact Organisation of a mini symposium at ILRI, Kenya on Immunogenetics in Cattle.
Start Year 2017
 
Description DH - Investigation of immune gene diversity in African Cattle breeds. 
Organisation University of Edinburgh
Department The Roslin Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution Providing access to the data on diversity.
Collaborator Contribution Sharing and sourcing of gDNA samples of African cattle breeds to investigate immune gene diversity.
Impact Organisation of a mini symposium at ILRI, Kenya on Immunogenetics in Cattle.
Start Year 2017
 
Description DH - Postdoctoral meeting between Pirbright and University of Surrey Postdocs 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact To organise a working network between Pirbright and University of Surrey postdoctoral students.
Year(s) Of Engagement Activity 2018
 
Description DH - Presentation "Haplotype reconstruction of bovine immune gene clusters using genome enrichment and PacBio sequencing": Immunogenetics workshop at ILRI, Kenya 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact 40 researchers/scientists attended a whole-day workshop at which individual groups presented work on immunogenetics to stimulate and explore opportunities for collaboration between ILRI and Pirbright.
Year(s) Of Engagement Activity 2017
 
Description DH - Presentation "Haplotype resolution of leukocyte receptor complex in cattle through targeted enrichment and SMRT sequencing" at ISAG conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presented our current research on Haplotype resolution of leukocyte receptor complex in cattle through targeted enrichment and SMRT sequencing.
Year(s) Of Engagement Activity 2010,2017
 
Description DH - Presentation "KIR haplotype discovery through targeted enrichment and SMRT sequencing in cattle" KIR workshop in Cambridge, UK 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presented our current research on Haplotype reconstruction of bovine immune gene clusters using targeted enrichment and PacBio sequencing.
Year(s) Of Engagement Activity 2017
 
Description DH - Presentation "Re-sequencing cattle IGCs with PacBio" Internal NGS (Next Generation Sequencing) discussion group 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Presented an introduction to our current research with focus on the analysis pipeline.
Year(s) Of Engagement Activity 2017
 
Description Dorothea Harrison - Cheltenham Science Festival 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact To make science understandable to non-scientists
Year(s) Of Engagement Activity 2016
URL https://www.pirbright.ac.uk/events/times-cheltenham-science-festival
 
Description Dorothea Harrison - Immunogenetics stand at Open Day of the Pirbright Institute 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Showcase to our colleagues and key stakeholders the work the Pirbright Institute does. Promote the transformation plan at the Institute.
Convey and discuss current Livestock Viral Diseases Programme research to peers at The Pirbright Institute as well as external visitors.
Year(s) Of Engagement Activity 2015
 
Description Dorothea Harrison - UK Veterinary Vaccinology Bioinformatics Workshop at Roslin Institute, Edinburgh
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The aim of this workshop was to understand what is currently being worked on, what resources are currently available, highlight gaps in current knowledge and resources, and ultimately identify future research priorities. Dr John Hammond, The Pirbright Institute opened the workshop by discussing the potential of functional genomics to improve animal health, setting the scene for the day. The workshop identified several aims including identifying pipelines and sharing methods within bioinformatics applicable to comparative and veterinary immune research. Applications of studying bioinformatics in veterinary vaccinology were highlighted:
•response to vaccination v natural infection
•tissue or cell specific interactions
•markers or correlates of protection
•splice variation

Dr Hammond's presentation led on to discussing the importance of characterising variable immune gene complexes, including repetitive immune gene complexes, highlighting research on KIR and MHC genes. Dr Hammond emphasised that at present the human genome is used as a gold standard for research and that there remain many gaps in livestock genome assembly and annotation. Of greatest scrutiny and target of improvement are the often variable genes associated with reproduction and the immune system. Further research should consider the quality of reference genomes used.
Year(s) Of Engagement Activity 2015
URL http://www.vetvaccnet.ac.uk/news/2015/11/veterinary-vaccinology-network-bioinformatics-workshop
 
Description ILRI Immunogenetics Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Organised an Immunogenetics workshop with multiple UK organisations at ILRI-Kenya
Year(s) Of Engagement Activity 2017
 
Description ISAG 2017 - Comparison of the cattle leukocyte receptor complex with related livestock species 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Discussed our current research
Year(s) Of Engagement Activity 2017
 
Description Immunogenetics Workshop at ILRI, Nairobi - The evolution and diversity of NK cell receptors in livestock species 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Discussed our current research
Year(s) Of Engagement Activity 2017
 
Description Internal Seminar - The evolution and diversity of natural killer cell receptors in bovids. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Presented our current research and received audience feedback
Year(s) Of Engagement Activity 2018
 
Description Internal seminar - Use of long-read sequencing to resolve complex genomic regions 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Discussed our current research
Year(s) Of Engagement Activity 2017
 
Description KIR 2017 Workshop - Comparison of the cattle leukocyte receptor complex with related livestock species 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Discussed our current research
Year(s) Of Engagement Activity 2017
 
Description Oral presentation - 2019 PAG meeting - Contiguity of the Goat Genome Assembly Enables Comparative Analyses and Gene Discovery within the Repetitive Immune-Related Regions. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Discussed our current research and received audience feedback
Year(s) Of Engagement Activity 2019
 
Description Oral presentation - KIR meeting 2018 - Goat and sheep KIR have independently expanded compared to cattle and a unique subtype is unusually expressed 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Presented current research and received audience feedback
Year(s) Of Engagement Activity 2018
 
Description Outreach at Fox Corner Community Wildlife Area 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Local community outreach event in which we volunteered to clear invasive plant species
Year(s) Of Engagement Activity 2017
 
Description PacBio Leiden 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Presented data to academics and industry experts on the methods we have developed to enrich and sequence targeted areas of genomes
Year(s) Of Engagement Activity 2018
 
Description Public Open Day at Diamond Light Source 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Staffed The Pirbright Institute's booth at Diamond Light Source's Public Open Day in July 2017. Described our current research to the general public.
Year(s) Of Engagement Activity 2017
 
Description conference seminar - cattle NKC at 2015 KIR Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Discussed our current research
Year(s) Of Engagement Activity 2015
 
Description internal seminar - NKC evolution 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Conveyed and discussed our current research to peers at The Pirbright Institute.
Year(s) Of Engagement Activity 2016
 
Description internal seminar - sequencing immune gene complexes 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact Presented an introduction to our current research
Year(s) Of Engagement Activity 2017
 
Description invited seminar - NKC and LRC co-evolution - University of Minnesota 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Discussed our current research
Year(s) Of Engagement Activity 2016
 
Description invited seminar - NKC evolution - UFMG, Brazil 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Discussed current research at UFMG Vet School in Belo Horizonte, Brazil
Year(s) Of Engagement Activity 2016
 
Description invited seminar - sequencing immune gene complexes - University of Minnesota 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Discussed current work using new reference genomes to characterise immune gene complexes
Year(s) Of Engagement Activity 2016