Efficient Bayesian modelling of infectious diseases in wildlife

Lead Research Organisation: UNIVERSITY OF EXETER
Department Name: Institute of Biomed & Clinical Science


Infectious diseases of wildlife result in significant welfare and conservation costs to wild animal populations. For example, chytridiomycosis, a disease caused by a fungal pathogen, is driving the mass extinction of numerous amphibian species worldwide, putting at risk >10% of all vertebrate species, while white-nose syndrome is estimated to have caused >6 million bat deaths in North America alone by 2012. Diseases in wildlife can also have significant impacts on agriculture. For example, bovine tuberculosis (bTB) is a notifiable disease in livestock, which costs the UK government over £100 million per year in terms of testing and compensation for slaughtered animals, and also has huge impacts on the livelihoods of farmers. The pathogen has a wide host range, which includes protected species such as badgers, and bTB is currently the subject of highly controversial badger culling trials aimed at attenuating disease transmission to livestock. Wildlife infectious diseases (WIDs) also represent a considerable threat to humans, as emerging zoonoses such as Ebola, Zika, West Nile virus, HIV and plague all attest. In short, the list of WID outbreaks is long and growing, and we desperately need better tools to study their epidemiology.

Mathematical modelling provides tools that enable us to better understand infectious disease dynamics, and can thus be used to help inform management strategies. However, using mathematical models without robustly fitting them to observed data can lead to poor model predictions and inference, in turn hindering scientific enquiry and increasing the probability of making poor policy decisions. Fitting dynamic transmission models to observed data is highly challenging, since the available data are incomplete, and thus standard statistical approaches that rely on direct evaluation of the likelihood function cannot be employed.

We will extend recent advances in simulation-based Bayesian inference methods, which have shown great utility in overcoming these difficulties. These approaches are flexible and tractable, but can be computationally demanding. This project will address key challenges, both in scaling these methods up to larger systems, and in dealing with the complexities that typify wildlife disease systems, such as: incomplete longitudinal sampling of individuals (i.e. capture-mark-recapture), the application of multiple diagnostic tests, uncertainties in diagnostic test performance, complex spatial and meta-population structures, and demographic changes over time. We will explore the development of constrained simulation techniques, which have been shown to greatly improve the efficiency of these inference algorithms in small populations, and hence are good candidates for improving efficiency in larger, more complex populations. We will also explore the use of these algorithms to allow for the fitting and comparison of different transmission models, again extending recent work in the field.
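
As a rough illustration of what simulation-based Bayesian inference involves, the sketch below uses approximate Bayesian computation (ABC) rejection sampling, one simple member of this family. It is not the method developed in this project. The chain-binomial SIR simulator, the uniform prior on the transmission rate, the final-size summary statistic and all numerical values are assumptions chosen only to keep the example short.

```python
import numpy as np


def simulate_final_size(beta, gamma, n, i0, rng, max_steps=200):
    """Discrete-time chain-binomial SIR; returns the total number ever infected."""
    s, i = n - i0, i0
    total = i0
    for _ in range(max_steps):
        if i == 0:
            break  # epidemic has died out
        p_inf = 1.0 - np.exp(-beta * i / n)      # per-susceptible infection probability
        new_inf = rng.binomial(s, p_inf)
        new_rec = rng.binomial(i, 1.0 - np.exp(-gamma))
        s -= new_inf
        i += new_inf - new_rec
        total += new_inf
    return total


def abc_rejection(observed, n_draws, tol, rng, n=100, i0=1, gamma=0.5):
    """Keep prior draws of beta whose simulated final size is within tol of the data."""
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(0.0, 3.0)             # hypothetical prior: beta ~ U(0, 3)
        sim = simulate_final_size(beta, gamma, n, i0, rng)
        if abs(sim - observed) <= tol:
            accepted.append(beta)
    return np.array(accepted)


rng = np.random.default_rng(42)
posterior_draws = abc_rejection(observed=60, n_draws=2000, tol=5, rng=rng)
print(len(posterior_draws), "accepted draws")
```

Most simulations are rejected, which is one reason such methods become computationally demanding as models and data sets grow.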

We will ground our research using the high-profile case study of bovine tuberculosis in badgers, which suffers from all of the system-uncertainty and data-quality issues described above. Additionally, the disease has a direct impact on the livelihoods of UK farmers, major policy decisions that influence voter behaviour, and the conservation and management of UK wildlife. We will use an unprecedented 40+ year longitudinal study of bTB in a natural, wild population of badgers, to provide a unique and powerful insight into the aetiology of the disease. Although we focus on wildlife disease systems in this project, the methodological advances developed will be applicable to a wider range of state-space systems.

Planned Impact

Wildlife infectious diseases have major impacts on wildlife, domestic animal and human populations, as evidenced by recent outbreaks of diseases such as bovine tuberculosis, chytridiomycosis, Ebola and Zika. Mathematical models provide a tractable means to understand the mechanisms of spread of a pathogen in a population, and potentially to predict the future course of an outbreak. However, the utility of these models is linked to the ability to fit them to observed data, and many epidemiological events (such as infection and recovery) are rarely (if ever) fully observed. Instead, the missing information must usually be inferred as part of a model fitting process, and it is important that the uncertainties due to the missing information are correctly characterised in the parameter estimates and model predictions. The increasing use of complex models without accounting for the uncertainties surrounding missing data/hidden states and/or different choices of model will lead to poor predictions, which will hinder scientific enquiry and increase the probability of making poor policy decisions.

This project will extend recent advances in model fitting and model comparison for state-space models in the presence of missing information to develop a pipeline that can be used for model inference, comparison and prediction. As such, the methodology developed would be applicable to a wide range of different state-space systems and thus could be an important addition to the toolbox of methods available to epidemiologists, and facilitate their adoption more widely outside of the statistical community. With the impacts of infectious diseases becoming more widely realised and technology for data collection rapidly improving, this research has potentially important consequences for developing and fitting statistical models to a wide range of important infectious disease systems. Importantly, the ability to more easily implement these methods could help to add to the suite of real-time modelling tools available for modelling epidemic outbreaks.

In addition, bovine tuberculosis (bTB) is a disease with huge socio-economic impacts in Great Britain. It currently costs the UK government >£100 million per year for surveillance and for compensation to cattle owners when infected animals must be culled, and despite intensive control measures the incidence of the disease in cattle herds has increased dramatically over the past 20 years. Furthermore, EU legislation requires that the UK government have in place a TB eradication policy. Control of the disease is complicated by the presence of a wildlife reservoir of infection, the European badger (Meles meles). The involvement of a second species complicates disease control, not least because the dynamics of Mycobacterium bovis (M. bovis) infection in badgers are largely unobserved, due to a national lack of systematic routine surveillance in badger populations. As a result, important mechanisms of within-species spread and persistence of the disease are still poorly understood, and improving our understanding of these mechanisms could shed important light on the potential for the eventual eradication of the disease in both badger and cattle populations. This project will use detailed data from Woodchester Park in Gloucestershire, where the world's most detailed longitudinal study of badger populations and bTB is being conducted. We will use these data to build and compare competing models of varying complexity to help understand how the disease spreads and is maintained in badger populations. The methodologies and findings from this study will also have wider implications for the study of other wildlife diseases.


Hudson DW (2023) Multi-locus homozygosity promotes actuarial senescence in a wild mammal. in The Journal of animal ecology

Description We have extended recent advances in simulation-based statistical inference techniques, notably the individual forward filtering backward sampling algorithm, to work efficiently in a large-scale 40+ year longitudinal study of bovine tuberculosis in badgers. This approach adapts to key challenges when modelling wildlife diseases, such as the use of capture-mark-recapture data, individual test histories/diagnostics, demography, multiple diagnostic tests, individual- and group-level variation in infectivity, susceptibility and test performance, as well as spatial meta-population structures. It also handles non-Markovian mortality structures, which are of key importance when studying endemic wildlife diseases. The models can be fitted to data sets in the region of >2,000 individuals across 40+ years in a matter of hours on a desktop machine, which is remarkably efficient for these types of methodology and makes a powerful case for the utility and efficacy of these approaches for the study of wildlife diseases and other systems. Papers are currently being prepared, but a pre-print of the first paper can be found here: https://doi.org/10.1101/2024.01.26.576600
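
For readers unfamiliar with forward filtering backward sampling (FFBS), the following is a minimal sketch of the basic building block for a single animal, and not the project's code (which is linked further down this record). It assumes a two-state hidden process (0 = uninfected, 1 = infected, with infection absorbing), one imperfect diagnostic test per occasion, and `None` for occasions on which the animal was not captured. The function names and all parameter values are hypothetical.

```python
import numpy as np


def emission(y, se, sp):
    """Likelihood of one test result under each hidden state (0 = uninfected, 1 = infected)."""
    if y is None:                       # not captured: the occasion carries no information
        return np.array([1.0, 1.0])
    if y == 1:                          # positive test
        return np.array([1.0 - sp, se])
    return np.array([sp, 1.0 - se])     # negative test


def forward_filter(obs, init, trans, se, sp):
    """Filtered probabilities P(state at t | observations up to t), one row per occasion."""
    alpha = np.zeros((len(obs), 2))
    pred = init
    for t, y in enumerate(obs):
        if t > 0:
            pred = alpha[t - 1] @ trans     # one-step-ahead prediction
        a = pred * emission(y, se, sp)
        alpha[t] = a / a.sum()
    return alpha


def ffbs(obs, init, trans, se, sp, rng):
    """Draw one hidden-state trajectory from its joint conditional distribution."""
    alpha = forward_filter(obs, init, trans, se, sp)
    n_t = len(obs)
    states = np.zeros(n_t, dtype=int)
    states[n_t - 1] = rng.choice(2, p=alpha[n_t - 1])
    for t in range(n_t - 2, -1, -1):
        # weight each candidate state by its filtered probability and by the
        # probability of moving from it to the state already sampled at t + 1
        w = alpha[t] * trans[:, states[t + 1]]
        states[t] = rng.choice(2, p=w / w.sum())
    return states


rng = np.random.default_rng(1)
trans = np.array([[0.8, 0.2],           # assumed per-occasion infection probability 0.2
                  [0.0, 1.0]])          # no recovery in this toy model
init = np.array([0.9, 0.1])
obs = [0, None, 1, None, 1]             # negative, missed, positive, missed, positive
print(ffbs(obs, init, trans, se=0.8, sp=0.95, rng=rng))
```

In the individual-level variant, a step of this kind is applied to one animal at a time inside a Markov chain Monte Carlo scheme, conditional on the current hidden states of the other animals, which is where the transmission pressure enters the transition probabilities.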

This work has so far fulfilled Objective 1 and most of Objectives 2 and 3 from the grant application, and a paper on it is currently being written up. The model results, when fitted to the Woodchester Park data, provide evidence of super-spreading individuals and marked heterogeneity of infectivity (as measured by novel individual-level reproduction number estimates) across individuals. The model fits the observed data well and can produce predictive information at the level of individual animals (for example, the probability of infection/infectivity/death over time, and estimates of the time of death or of other epidemiological events, amongst others). These can be aggregated up to social group and population levels, with estimates of uncertainty, and can be used to derive population-level reproduction numbers which are consistent with other studies. Thus we have addressed key questions 1, 3 and 4 highlighted in the application. We haven't yet built a model that incorporates explicit between-group transmission, outside of movements of badgers between social groups, and so have only partially addressed key question 2 at the current time.

We are currently working on extending these models to explore the effects of e.g. sex on mortality and transmission potential, and the framework is in place to allow us to explore various epidemiological ideas of interest. Once this is done we will have completed Objectives 2 and 3. All the code is written in open-source software and will be hosted on GitHub, contributing towards Objective 4 of the grant. Future work will aim to embed some of these ideas in the SimBIID R package to make them more readily accessible to other disease ecologists.
Exploitation Route The project highlights the utility of recent advances in simulation-based inference for fitting complex infectious disease models to partially observed data in an efficient and flexible way. Although these methods do not solve all challenges in inference for infectious disease systems, they show good utility where they are relevant (systems with small- to medium-sized populations), and provide highly detailed outcomes at the individual-level, which can be aggregated up to lower resolutions as required. We will provide open-source code to replicate all these models, which could be utilised by other researchers for modelling their own systems. These approaches also have the potential to be more readily integrated into general purpose software than more traditional approaches, and so future work will aim to make steps in that direction, which will greatly facilitate their adoption and extend the range of modelling tools available to infectious disease researchers. Code is openly and freely available here: https://github.com/evandrokonzen/WP_bTB_code. Code will also be published on the Environmental Information Data Centre in due course. We already have plans to develop these methods further, with the eventual aim to produce software to facilitate wider adoption.
Sectors Healthcare


Title Individual forward filtering backward sampling algorithm for modelling bovine tuberculosis spread within a population of wild badgers in Woodchester Park 
Description Code to fit a transmission model of bovine tuberculosis spread to a population of wild badgers in Woodchester Park in the UK. The code produces Markov chain Monte Carlo samples from a model fitted to individual-level badger data from Woodchester Park. The badger data came from and can be requested from the Animal and Plant Health Agency. This code will also be published via the Environmental Information Data Centre, with an associated DOI in due course. 
Type Of Material Computer model/algorithm 
Year Produced 2023 
Provided To Others? Yes  
Impact None yet, outside of important biological insights that are currently in submission, but can be found as a pre-print here: https://doi.org/10.1101/2024.01.26.576600 
URL https://github.com/evandrokonzen/WP_bTB_code
Description Animal and Plant Health Agency - Woodchester Park 
Organisation Animal and Plant Health Agency
Country United Kingdom 
Sector Public 
PI Contribution We provide methodological expertise for analysing the data collected from the APHA Woodchester Park study.
Collaborator Contribution The APHA manages the Woodchester Park longitudinal study of badger populations, which has been ongoing for more than 40 years.
Impact None yet, but research is ongoing.
Start Year 2021