New methods for model construction and calibration using machine learning and mathematical modelling.

Lead Research Organisation: University of Oxford
Department Name: Mathematical Institute

Abstract

The increasing availability of biological data should inform researchers of previously undiscovered mechanisms in biological systems, across temporal and spatial scales. Currently, many such systems are modelled with partial differential equations (PDEs). However, for most problems, competing PDE models exist, some with different interpretations. Motivated by recent advances in equation discovery methods, this project will focus on extending methods such as biologically informed neural networks (Lagergren, Nardini, Baker, Simpson, & Flores, 2020) and other computational approaches to obtain data-driven PDE models. In so doing, we hope to contribute to the problem of model validation. This is of interest mathematically as well as biologically, as discovering new terms in PDE models has already been shown to lead to new biological insights, such as the possible role of cell damage in proliferation assays (Lagergren, Nardini, Baker, Simpson, & Flores, 2020).

Given that the goals of mathematical models in biology range from building predictive models to quantifying parameters that cannot be measured (Browning, Warne, Burrage, Baker, & Simpson, 2020), it is of vital practical importance to obtain reliable estimates from available data. Further, when calibrating models, it is important to quantify uncertainty in model parameters (and, consequently, in model predictions). Uncertainty quantification not only provides information about the possible error of the model but can also indicate whether the available data provide strong evidence in favour of the modelling choices, or whether other choices ought not to be excluded. Currently, there is no accepted way to carry out this analysis for equation learning methods, even though it is vital if reliable insights into the underlying dynamics are to be found.

Another issue in obtaining reliable estimates from available data is parameter identifiability (Simpson, Baker, Vittadello, & Maclaren, 2020). Whether reliable estimates can be obtained from the data at all has important ramifications for both the predictive power of a model and the mechanistic insight that can be obtained (Simpson, Baker, Vittadello, & Maclaren, 2020). Developing a way to quantify the practical identifiability of functions in equation discovery methods is therefore important. Such insights can aid the design of experiments that yield the data necessary to discriminate between competing models.

A principal aim is to use the methods outlined above to estimate parameters and calibrate models in several applied problems where a large amount of data is available. An initial investigation will concern siRNA screens of wound healing assays, where the aim is to infer the link between gene perturbation and phenotype.

This research falls mainly within the EPSRC research area of Mathematical Biology, and connects to the themes of Healthcare Technologies and Mathematical Sciences more generally.

This project also relates to the EPSRC research areas of Artificial Intelligence Technologies, as it works on uncertainty quantification of neural network predictions; Statistics and Applied Probability, as we investigate stochastic models; and Non-Linear Systems, as the PDE models under investigation often contain non-linearities.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/V520202/1                                   01/10/2020  31/10/2025
2426446            Studentship   EP/V520202/1  01/10/2020  30/09/2024  Simon Martina Perez