What, where and weather? Integrating open-source taxonomic, spatial and climatologic information into a comprehensive database of livestock infections

Lead Research Organisation: University of Liverpool
Department Name: Institute of Infection and Global Health

Abstract

What are all the species of pathogen that affect our livestock? It is important to answer this question, to help protect the animals that produce our food, and also because nearly 7 out of every 10 human pathogens came from animals, with a good number from the livestock and pets that we closely associate with. Remarkably, however, even for humans this question was only answered ten years ago (1415 were listed) and there remains no definitive answer for livestock, domestic pets and other animals.
This proposal aims to further develop a new database of livestock (and other) pathogens, called EID2 (ENHanCED Infectious Diseases 2). EID2 has been built largely from the tens of millions of records of DNA and RNA sequences that are uploaded onto public databases; where such sequences are from a pathogen, they are frequently uploaded with further information on the host (which animal the pathogen was obtained from), where and when it was obtained, and who by. EID2 takes this information, and draws conclusions; for example, that a certain type of pathogen infects a certain host species, and is/was present in a certain country at a certain time. Similar conclusions can be drawn, and added to EID2, from the tens of millions of publications held in other public databases, thereby covering times and places where sequencing has not been extensive. EID2 can map the pathogens and, using incorporated climate data, it can model the climate conditions that determine their distribution. EID2 is open access.
TRDF funding will enable us to finalise the development of EID2 into a tool and resource for researchers of pathogens of livestock and domestic pets. We will develop the database to hold more spatially detailed information (at county rather than country level) and improve its ability to handle records where the host species is not clearly defined. We will add further environmental data to allow users to produce better models to explain pathogen distributions, and even predict them in the future, given climate change; and we will allow users to work at the level of diseases, rather than individual pathogens or groups of pathogens. Finally, we will give users the ability to add certain information of their own.

Technical Summary

This proposal aims to improve a new open-access database of livestock (and other) pathogens, called EID2 (ENHanCED Infectious Diseases 2). EID2 has been built by bringing together information from tens of millions of records held in the NCBI's publicly accessible taxonomy, nucleotide and PubMed databases. The database provides, we believe, the most complete list of animal (including human) pathogens. It is taxonomically structured, meaning it is possible to select information for higher taxonomic levels such as genera and families. It is spatial, storing and mapping the countries of presence for thousands of pathogens; and it uses an evolutionary machine learning technique to analyse the climatic conditions associated with pathogen presence.
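As an illustration of what "taxonomically structured" allows, the sketch below rolls species-level country records up to genus and family level. It is a minimal example under stated assumptions, not the EID2 implementation: the taxon and country names are placeholders, and the `PARENT` mapping, `lineage` and `countries_by_taxon` are hypothetical names introduced here.

```python
from collections import defaultdict

# Hypothetical child -> parent links (placeholder names, not real taxa).
PARENT = {
    "species_a": "genus_x",
    "species_b": "genus_x",
    "genus_x": "family_f",
    "species_c": "genus_y",
    "genus_y": "family_f",
}

# Hypothetical species-level records of countries of presence.
SPECIES_COUNTRIES = {
    "species_a": {"Country1"},
    "species_b": {"Country2"},
    "species_c": {"Country2", "Country3"},
}


def lineage(taxon, parent):
    """Return the taxon followed by all of its ancestors, nearest first."""
    chain = [taxon]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def countries_by_taxon(species_countries, parent):
    """Roll species-level country records up to every ancestor taxon."""
    rolled = defaultdict(set)
    for species, countries in species_countries.items():
        for taxon in lineage(species, parent):
            rolled[taxon] |= countries
    return dict(rolled)


rolled = countries_by_taxon(SPECIES_COUNTRIES, PARENT)
```

With this structure, selecting `rolled["genus_x"]` or `rolled["family_f"]` returns the union of the countries recorded for every species beneath that taxon.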
TRDF funding will enable us to finalise the development of EID2 into a tool and resource for researchers of pathogens of livestock and domestic pets. We will develop the database to hold more spatially detailed information (at county/state/province rather than country level) and improve its ability to extract information from records where the host species is ambiguous. We will add further environmental data to allow users to produce better models to explain pathogen distributions, and predict them in the future, given climate change; and we will allow users to work at the level of diseases, rather than individual pathogens or groups of pathogens. Finally, we will add crowd-sourcing functions that will give users the ability to add certain information of their own, such as nominating new associations (host-pathogen; pathogen-country) or correcting mistaken entries.

Planned Impact

Infectious diseases of livestock are, to varying extents, the business of a large number of non-academic organisations: from government ministries and international organisations concerned with animal health (e.g. OIE) and food security (FAO), to NGOs concerned with development, and charities concerned with natural disasters. Within government, the greatest relevance (in the UK) is to the Department for Environment, Food and Rural Affairs and its agency, AHVLA, but livestock diseases also touch on the Department of Health (zoonoses), the Department for Business, Innovation and Skills (commercial opportunities, economic costs), and the Ministry of Defence (bioterrorism). This broad relevance is demonstrated to some extent by the range of organisations which have commissioned livestock-centred reports from Baylis in recent years: the UK government's Foresight programme (2005), the Health Protection Agency (2010), the World Bank (2011), the US Department of Defense (2011) and the Smith School of Enterprise and the Environment (2011).
We believe government and other organisations can already, or will shortly be able to, use EID2 for horizon scanning (for pathogens near to Europe, present in specific trading countries, or most sensitive to climate), for information gathering in order to prepare briefings for government (on specific pathogens during an emergency or potential emergence event), or as a research tool for policy development for disease control. Longer term, it may serve a function in disease surveillance (EID2 stores both time and space information), although its dependence on publications and sequence uploading means it cannot be, and is not intended to be, as responsive as ProMED, for example.

Publications

 
Description This grant has led to the creation of a new database called the Enhanced Infectious Diseases database (EID2). EID2 has been used in several studies which have revealed the following:
1. The linkage of humans to other animals through the sharing of pathogens, i.e. do humans share pathogens with the animals we are most closely related to (primates), those we eat (livestock) or those we live with (pets)? Our analysis suggests that all three are important.
2. Data in EID2 have underpinned a risk assessment of the sensitivity of diseases to climate and, hence, a measure of how significant climate change may be for future infectious disease transmission. We find that ~60% of diseases (of both humans and animals) have evidence of climate drivers.
3. There is a positive association between the pathogen burden of a domestic animal shared with people (i.e. the number of pathogen species known to infect both it and humans) and the time since domestication. This provides evidence that domestication is a driver of the spread of pathogens between us and our animals.
Exploitation Route This project aimed to develop EID2 as a resource for us and the wider research community. We have recently received a new TRDF award to develop the database further and promote its use. We will continue to encourage others to use it.
Sectors Agriculture, Food and Drink

 
Description The paper published in Scientific Data (a full description of the database) was reported on the website I f**king love science; it received over 10,000 views in a couple of days and was extensively discussed by the readership (of more than 24 million people).
First Year Of Impact 2016
Sector Other
Impact Types Societal

 
Description Health. Climate Change Impacts.
Geographic Reach National 
Policy Influence Type Citation in other policy documents
URL http://www.nerc.ac.uk/research/partnerships/lwec/products/report-cards/health/report-card/
 
Title Data and code for: "Monkeypox virus shows potential to infect a diverse range of native animal species across Europe, indicating high risk of becoming endemic in the region." 
Description Background: Monkeypox is a zoonotic virus which persists in animal reservoirs and periodically spills over into humans, causing outbreaks. During the current 2022 outbreak, monkeypox virus has persisted via human-human transmission, across all major continents and for longer than any previous record. This unprecedented spread creates the potential for the virus to 'spillback' into local susceptible animal populations. Persistent transmission amongst such animals raises the prospect of monkeypox virus becoming enzootic in new regions. However, the full and specific range of potential animal hosts and reservoirs of monkeypox remains unknown, especially in newly at-risk non-endemic areas. Methods: Here, our pipeline utilises ensembles of classifiers comprising different class balancing techniques and incorporating instance weights, to identify which animal species are potentially susceptible to monkeypox virus. Subsequently, we generate spatial distribution maps to highlight high-risk geographic areas at high resolution. Findings: We show that the number of potentially susceptible species is currently underestimated by 2.4 to 4.3-fold. We show a high density of susceptible wild hosts in Europe. We provide lists of these species, and highlight high-risk hosts for spillback and potential long-term reservoirs, which may enable monkeypox virus to become endemic. 
Type Of Material Computer model/algorithm 
Year Produced 2022 
Provided To Others? Yes  
Impact N/A 
URL https://figshare.com/articles/software/Blagrove_et_al_2022_poxvriuses_data_and_code/20485332
 
Title Enhanced Infectious Diseases Database, EID2
Description Comprehensive database of infectious agents of animals and humans 
Type Of Material Database/Collection of data 
Year Produced 2012 
Provided To Others? Yes  
Impact papers 
URL http://www.zoonosis.ac.uk/EID2
 
Title The Enhanced Infectious Diseases Database, EID2 
Description In order to provide answers to these questions the EID2 system comprises the following components:
1) Data repositories: EID2 maintains a number of complex data repositories and mapping dictionaries to facilitate interaction discovery and named entity recognition, including:
a) Organisms and their taxonomic lineage relationships (over 1 million organisms to date).
b) Alternative names (e.g. common names, common misspellings, breeds and acronyms), and inclusion (AND) and exclusion (NOT) terms for the organisms.
c) Geographical names and hierarchies, including countries, administrative divisions, major cities and natural features.
d) Climate (e.g. temperature and rainfall) and demographic (human and livestock) data for the whole world.
2) Data acquisition layer: EID2 continually retrieves and classifies evidence from two sources: the NCBI Nucleotide Sequences database and PubMed (soon to include Scopus as a third). Each piece of evidence is then linked to organisms and geographical locations. Sequences are often linked to one "cargo" organism, which is either a microbe (pathogen) or an arthropod vector, one host organism and one location. Publications, however, are often linked to multiple organisms and locations. One powerful use of EID2 is the ability to quickly extract and filter evidence based on the number of host, pathogen or vector species, or locations, it mentions.
3) Interactions discovery pipeline: EID2 extracts three types of interactions from its evidence bases: organism-organism interactions, organism-location interactions and organism-organism-location interactions.
4) EID2 Portal: publicly accessible at https://eid2.liverpool.ac.uk/. The portal enables users to browse through EID2 data, look up interactions for one or more organisms, and produce tailored maps.
Papers further describing the resource: Wardeh, M., Risely C., McIntyre K.M., Setzkorn C. and Baylis M. 2015. Database of host-pathogen and related species interactions, and their global distribution. Sci. Data. 2:150049. DOI:10.1038/sdata.2015.49.
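The evidence filtering and the three interaction types described above can be sketched roughly as follows. This is an illustrative example only, not the EID2 pipeline: the `Evidence` record, the `is_specific` and `interactions` helpers, and all organism and country names are hypothetical. It also assumes, for simplicity, that co-mention in one piece of evidence is enough to propose an interaction, which a real pipeline would need to check.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Evidence:
    """One piece of evidence (hypothetical record layout)."""
    source: str            # "sequence" or "publication"
    cargos: frozenset      # pathogens or arthropod vectors
    hosts: frozenset
    locations: frozenset


def is_specific(ev, max_cargos=1, max_hosts=1, max_locations=1):
    """Keep evidence that mentions no more than the given number of entities."""
    return (len(ev.cargos) <= max_cargos
            and len(ev.hosts) <= max_hosts
            and len(ev.locations) <= max_locations)


def interactions(ev):
    """Propose the three interaction types from co-mentioned entities."""
    org_org = set(product(ev.cargos, ev.hosts))
    org_loc = set(product(ev.cargos | ev.hosts, ev.locations))
    org_org_loc = set(product(ev.cargos, ev.hosts, ev.locations))
    return org_org, org_loc, org_org_loc


# A sequence record: one cargo, one host, one location.
seq = Evidence("sequence", frozenset({"pathogen_p"}),
               frozenset({"host_h"}), frozenset({"Country1"}))

# A publication mentioning several hosts and locations.
pub = Evidence("publication", frozenset({"pathogen_p"}),
               frozenset({"host_h", "host_k"}),
               frozenset({"Country1", "Country2"}))

specific = [ev for ev in (seq, pub) if is_specific(ev)]
org_org, org_loc, org_org_loc = interactions(seq)
```

Filtering on entity counts before extraction is one way to favour unambiguous evidence, such as a sequence with a single host and location, over publications that mention many organisms at once.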
Type Of Material Database/Collection of data 
Year Produced 2012 
Provided To Others? Yes  
Impact publications. 
URL https://eid2.liverpool.ac.uk/
 
Description Sapienza University of Rome 
Organisation Sapienza University of Rome
Country Italy 
Sector Academic/University 
PI Contribution Exploring the mammalian virome to detect patterns of compatibility between mammal species and viruses at a global scale, identifying eco-biological profiles of viral carriers along the fast-slow continuum of mammalian life-history.
Collaborator Contribution Provided insight into virus-mammal interactions, and the role virus traits have in the transmission/spill-over of viruses.
Impact Publication: Identifying patterns along the fast-slow continuum of mammalian viral carriers (in prep/under review)
Start Year 2022
 
Description Panellist, BBSRC webinar for COP26. Climate change bites; & associated blog post 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact As part of its contribution to COP26, BBSRC organised an online panel event, chaired by the BBC's Victoria Gill. I was an invited panellist, taking part in the Q&A. There was also a blogged Q&A.
Year(s) Of Engagement Activity 2022
URL https://medium.com/@UKRI/biting-bugs-are-set-to-benefit-from-climate-change-heres-why-that-s-a-probl...
 
Description Public Science event 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact "Big data: snog, marry, avoid" - presentation by Maya Wardeh.
Location: The Vines pub, Liverpool
Year(s) Of Engagement Activity 2016
 
Description School lecture on 'Seven sizes of sickness' - Kingsmead, Calday Grammar and West Kirby Grammar
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Talk sparked questions and discussion afterwards

A 6th former from the school has subsequently done 1 week's work experience in my laboratory.
Year(s) Of Engagement Activity 2013
 
Description Speaker on climate change at Wilmslow Guild 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact Talk sparked questions and discussion afterwards

Year(s) Of Engagement Activity 2012