Simulation Package for Efficient Experimental Design and Inference in Microbiology

Lead Research Organisation: University of Cambridge
Department Name: Veterinary Medicine

Abstract

There is growing recognition within biological sciences that mathematical modelling provides powerful methods to improve our quantitative understanding of the dynamics of biological systems. Harnessing the full potential of these methods requires a complete integration of experimental data and dynamic models within the proper statistical framework. However, progress in this area has been patchy: while some fields of biology (systems biology) lead the way, others are lagging behind. Our proposal aims to develop and deliver a free computational package that will facilitate the complete integration of dynamic models and laboratory experiments, with an initial focus on research into host-pathogen interactions.
This open-source software will have two related and essential functions:
- Statistical inference (SI): given a mechanistic model combining current knowledge and hypotheses about a biological system, how much information can be extracted from new experimental data about mechanisms that cannot be directly observed?
- Optimal experimental design (OED): given a mechanistic model and preliminary data, what is the best way to design an experiment within set (budgetary or technical) constraints in order to maximise the expected gain of information?
Recent progress in scientific computing has allowed the rapid development of algorithms for SI and OED, but they have been applied independently to other areas of research. Our project will deliver the first "one-stop shop" for inter-disciplinary research projects in microbiology. We will use state-of-the-art methods from applied statistics and tailor them to the specific needs of experimental biologists. An important novelty will be our focus on stochastic simulations, which allow random variations in the dynamics of a system: as in experiments with living organisms, repeats of the same procedure never yield exactly the same results. Because they capture this essential feature of real systems, stochastic models allow more reliable and accurate inference, albeit at the cost of greater computational complexity. Our many years of expertise at the interface of statistical modelling and experimental biology put us in a very strong position to tackle these challenges.
This 18-month project will enable us to develop and test the functionality of the package with two experimental systems using existing and new data, before releasing it for free and public use in inter-disciplinary biological research. The software will be delivered as a package for use within the R software, which is a free statistical platform.

Technical Summary

We will develop Bayesian statistical tools for statistical inference (SI) and optimal experimental design (OED) for stochastic mechanistic models. Various techniques have been proposed in recent years and applied to physical sciences and systems biology. We will use our expertise in mathematical modelling and Bayesian statistical inference in microbiology to bring together a set of tools that can work together and provide useful information for experimental microbiologists.
Most existing methods for Bayesian SI with stochastic models are based on so-called Approximate Bayesian Computation (ABC), which replaces the likelihood with summary statistics in order to identify the simulations that are most similar to the data. Although computationally efficient, these methods can provide incomplete or even unreliable predictions if the chosen summary statistics are not sufficient. Therefore, we will also propose an alternative approach, which combines stochastic simulations and observational noise to approximate the likelihood through importance sampling.
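As an illustration of the ABC idea described above, here is a minimal rejection-sampling sketch in Python. It is not the project's implementation: the pure-death model, the uniform prior, the tolerance and all function names are assumptions chosen for brevity. The mean bacterial count across replicate animals serves as the only summary statistic, which is exactly the kind of choice that can lose information when the statistic is not sufficient.

```python
import random

def simulate_death_process(n0, death_rate, t_end, rng):
    """Gillespie simulation of a pure-death process; returns survivors at t_end."""
    n, t = n0, 0.0
    while n > 0:
        # Waiting time to the next death is exponential with total rate death_rate * n.
        t += rng.expovariate(death_rate * n)
        if t > t_end:
            break
        n -= 1
    return n

def abc_rejection(observed, n0, t_end, n_draws, tolerance, rng):
    """Keep prior draws whose simulated mean count lies within `tolerance` of the observed mean."""
    obs_mean = sum(observed) / len(observed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.01, 2.0)  # hypothetical uniform prior on the death rate
        sims = [simulate_death_process(n0, rate, t_end, rng) for _ in observed]
        sim_mean = sum(sims) / len(sims)  # summary statistic: mean count
        if abs(sim_mean - obs_mean) <= tolerance:
            accepted.append(rate)
    return accepted

rng = random.Random(1)
# Synthetic "data": 5 replicate animals, 100 bacteria each, true death rate 0.5.
observed = [simulate_death_process(100, 0.5, 1.0, rng) for _ in range(5)]
posterior = abc_rejection(observed, n0=100, t_end=1.0,
                          n_draws=2000, tolerance=3.0, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean rate = {posterior_mean:.3f}")
```

The accepted draws approximate the posterior distribution of the death rate. Tightening the tolerance improves the approximation at the cost of rejecting more simulations.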
We will exploit recent progress in Bayesian OED to build an efficient tool that can be combined with our stochastic inference methods. Although the use of stochastic mechanistic models within OED has been considered theoretically (e.g. in systems biology), to our knowledge it has not been demonstrated in practice. This will be our greatest challenge, but one we are ideally placed to address. The expertise we share with our collaborators in statistical modelling and experimental biology will enable us to concentrate on narrowly defined specifications, which can then be broadened for application to other fields.
Testing of our software will be performed with two experimental systems as part of ongoing research projects led by our close collaborators: a murine model of typhoid infection, and an invertebrate model of enteric bacterial symbiosis.

Planned Impact

There are growing ethical concerns surrounding the use of animals in scientific research. Caught between the imperatives of animal welfare and translational research, biologists need tools and practical guidance to pursue scientific research that contributes to our health, knowledge and well-being at a cost that society is willing to bear. The 3Rs (replace, reduce, refine) provide a framework for animal research without imposing targets, encouraging the development of innovative approaches. In areas where replacement of animals is not yet possible or needs validation, scientists are required to combine state-of-the-art experimental techniques with mathematical and statistical models in order to reduce and refine the use of animals. Traditional statistical tools for experimental design, such as power calculation, have been designed to predict simple linear relationships between experimental factors and observed variables. In living systems, however, non-linear responses mediated by complex processes are the norm. Mechanistic models aim to reproduce parts of these complex processes in order to generate accurate predictions of the system's response to a given experimental treatment. Developing, refining and validating these mechanistic models are essential steps towards replacement of animals in experimental research. Our package will facilitate this process by providing scientists with the tools they need to perform two crucial statistical operations: inference and experimental design. Although the theory underlying these operations in the context of mechanistic models is well established, their practical implementation for real-life problems was until recently hindered by computational limitations. These challenges have started to be overcome in the last ten years, and we are now in an ideal situation to translate the latest progress in applied statistical research into practical solutions for life scientists.
By making our package available for free in the R environment, and enabling anyone to reuse, adapt and modify it under an Open Source General Public License (www.opensource.org), we hope to be able to reach a broad scientific community. Thus our package will contribute to equipping experimental scientists with the means to carry out high-impact research within the 3Rs framework.
 
Description We have developed a new statistical method to analyse experimental data on bacterial infection dynamics. The method is both accurate and much more computationally efficient than previous ones. We applied the method to the results of a study of typhoid vaccines in mice, which aims to compare a new vaccine with an existing vaccine. Both vaccines are only partially effective against systemic infection by Salmonella (the causative agent of typhoid), but the exact differences are difficult to tease apart. Our analysis shows distinct effects, first in the initial stage of infection, when the older vaccine promotes rapid killing of bacteria in the blood; later, during colonisation of organs by Salmonella, mice immunised with different vaccines differ in their ability to control the local growth and systemic spread of infection.
More recently, we have completed the development of our analytical toolkit to meet our original objectives. Firstly (Vlazaki et al 2020), we have applied our model to a new experimental dataset to show that antibiotic treatment may not act uniformly on Salmonella bacteria within an animal, supporting previous findings that a subset of slow-dividing bacteria may better survive antibiotic treatment. Secondly, we have published a proof-of-concept to show how our toolkit can be used to optimise experimental design, in particular the allocation of a fixed number of laboratory animals between time points (Vlazaki et al 2021).
Exploitation Route It will enable us and others to analyse and design microbiology experiments, contributing to a more efficient use of laboratory animals.
Our method and results were published in PLoS Computational Biology in 2017, and I have approached other labs who are interested in using it. In particular, I obtained a travel grant to visit a lab in the USA in September 2017. This has led to a collaboration on a different pathogen, Bordetella bronchiseptica, and I am currently analysing the results of the first experiment using the package.
Our latest publications, in particular Vlazaki et al (2020) and Vlazaki et al (2021), provide clear hypotheses and pathways for future experimental studies. The former adds to the growing evidence that clonal bacterial populations exhibit heterogeneity that affects the response to antibiotics in vivo, and offers a new method to assess this effect. The latter paper demonstrates how animal experimental design can be optimised with the use of our modelling package.
Sectors Pharmaceuticals and Medical Biotechnology

URL https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005841&rev=1
 
Title Moments-Based Inference Method for Within-Host Bacterial Infection Dynamics 
Description Over the last ten years, isogenic tagging (IT) has revolutionised the study of bacterial infection dynamics in laboratory animal models. However, quantitative analysis of IT data has been hindered by the piecemeal development of relevant statistical models. The most promising approach relies on stochastic Markovian models of bacterial population dynamics within and among organs. Here we present an efficient numerical method to fit such stochastic dynamic models to in vivo experimental IT data. A common approach to statistical inference with stochastic dynamic models relies on producing large numbers of simulations, but this remains a slow and inefficient method for all but simple problems. Instead, we derive and solve the systems of ordinary differential equations for the two lower-order moments of the stochastic variables (mean, variance and covariance). For any given model structure, and assuming linear dynamic rates, we demonstrate how the model parameters can be efficiently and accurately estimated by divergence minimisation. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2017 
Impact None as yet 
URL http://www.biorxiv.org/content/early/2017/03/13/116319
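
The moments-based approach described in the entry above can be sketched for the simplest linear case, a single-compartment birth-death process. The Python example below is only an illustration under stated assumptions, not the published method: the one-compartment model, the Runge-Kutta integrator, the parameter grid and all function names are hypothetical, and a Gaussian Kullback-Leibler divergence is swapped in because the description does not say which divergence is minimised. For per-capita birth rate b and death rate d, the mean m and variance v obey dm/dt = (b - d) m and dv/dt = 2 (b - d) v + (b + d) m.

```python
import math

def moment_rhs(mean, var, birth, death):
    """Right-hand side of the moment ODEs for a linear birth-death process."""
    d_mean = (birth - death) * mean
    d_var = 2.0 * (birth - death) * var + (birth + death) * mean
    return d_mean, d_var

def integrate_moments(n0, birth, death, t_end, steps=200):
    """Fourth-order Runge-Kutta integration from a known inoculum n0 (variance 0)."""
    h = t_end / steps
    m, v = float(n0), 0.0
    for _ in range(steps):
        k1m, k1v = moment_rhs(m, v, birth, death)
        k2m, k2v = moment_rhs(m + 0.5 * h * k1m, v + 0.5 * h * k1v, birth, death)
        k3m, k3v = moment_rhs(m + 0.5 * h * k2m, v + 0.5 * h * k2v, birth, death)
        k4m, k4v = moment_rhs(m + h * k3m, v + h * k3v, birth, death)
        m += h * (k1m + 2 * k2m + 2 * k3m + k4m) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return m, v

def gaussian_kl(obs_mean, obs_var, mod_mean, mod_var):
    """KL divergence from N(obs) to N(model); an assumed stand-in divergence."""
    return 0.5 * (math.log(mod_var / obs_var)
                  + (obs_var + (obs_mean - mod_mean) ** 2) / mod_var - 1.0)

# Synthetic target moments generated with birth = 0.8 and death = 0.5.
target_mean, target_var = integrate_moments(10, 0.8, 0.5, 2.0)

# Crude grid search standing in for a proper optimiser.
rates = [round(0.1 * i, 1) for i in range(1, 16)]
grid = [(b, d) for b in rates for d in rates]
best = min(grid, key=lambda p: gaussian_kl(
    target_mean, target_var, *integrate_moments(10, p[0], p[1], 2.0)))
print("estimated (birth, death):", best)
```

Because the mean depends only on b - d while the variance also depends on b + d, matching both moments separates the two rates, which a fit to the mean alone could not do.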