Practical Time-Series Modelling for Scientific Data

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

The analysis of time-series data is central to many scientific domains, such as electrochemistry, cardiac electrophysiology, and pharmacokinetics. In these and other fields, sophisticated computational modelling, Bayesian techniques, and machine learning are employed for the analysis of time-series. In this project, we consider several research directions which enable more principled, accurate, and efficient inference for time-series, with application to real-world scientific datasets.
The overarching goal is to provide the research community with a practical guide to using machine learning and, in particular, Bayesian inference techniques to parameterize mathematical models of physical and biological systems whose experimental interrogation yields time-series data. We have identified a range of topics, arising from previous research in our group, that will form the initial directions of this project. Each is discussed briefly below.
Correlated noise. For simplicity, many existing time-series models do not consider correlation of noise. However, it is increasingly recognised that many noise processes are autocorrelated. Particularly worrying is that the simplifying assumption of no correlation in the noise can lead to underestimation of uncertainty in parameter inference. By extending previously used time-series models to capture correlation in the noise, we intend to enable more accurate inference and overcome previously observed erroneous results.
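As an illustration of the difference between the two noise assumptions, the sketch below compares an independent Gaussian log-likelihood with one for first-order autoregressive, AR(1), noise. The AR(1) form is only one possible model of correlated noise and is an assumption made here for illustration; the function names and the parameters rho (lag-one correlation) and sigma (marginal standard deviation) are hypothetical and do not come from the project itself.

```python
import numpy as np


def iid_log_likelihood(residuals, sigma):
    """Gaussian log-likelihood assuming independent noise (no correlation)."""
    residuals = np.asarray(residuals, dtype=float)
    n = residuals.size
    return (-0.5 * n * np.log(2 * np.pi * sigma**2)
            - 0.5 * np.sum(residuals**2) / sigma**2)


def ar1_log_likelihood(residuals, rho, sigma):
    """Gaussian log-likelihood assuming stationary AR(1) noise.

    Illustrative assumption: e[t] = rho * e[t-1] + v[t], where each e[t]
    has marginal standard deviation sigma and |rho| < 1.
    """
    residuals = np.asarray(residuals, dtype=float)
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Variance of the innovations v[t] that keeps the marginal variance at sigma^2.
    innovation_var = sigma**2 * (1.0 - rho**2)
    # The first residual is drawn from the stationary distribution.
    first = (-0.5 * np.log(2 * np.pi * sigma**2)
             - 0.5 * residuals[0]**2 / sigma**2)
    # Every later residual is conditioned on its predecessor.
    innovations = residuals[1:] - rho * residuals[:-1]
    rest = (-0.5 * innovations.size * np.log(2 * np.pi * innovation_var)
            - 0.5 * np.sum(innovations**2) / innovation_var)
    return first + rest
```

Here residuals would be the difference between the observed time series and the model output at a given parameter value. Setting rho to zero recovers the independent case, so rho can be inferred alongside the model parameters rather than fixed in advance.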
Model misspecification. Models used for time-series data are often misspecified, usually due to an incomplete understanding of the system. We propose to investigate the detection of model discrepancy for time-series, including a focus on explaining the effects of model discrepancy on parameter inference. This area will be approached initially through in silico experiments using model problems so that the degree of model misspecification can be controlled, and its effects quantified, before applying the techniques developed to real-world problems.
Emulation of the likelihood. In many scientific applications, computational cost is a major bottleneck; in particular, evaluating the likelihood can be very expensive. Emulation is a computational strategy that approximates the likelihood with a function that is cheaper to evaluate. Although emulators can enable significant speedups, concerns remain over their accuracy. We plan to consider using emulators within a sampling strategy that corrects for this inaccuracy via an accept/reject step, potentially allowing exact sampling of a posterior distribution. The speedup enabled by this approach, combined with existing sophisticated Markov Chain Monte Carlo (MCMC) algorithms, could be key to increasing the tractability of computationally intensive scientific problems.
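One established scheme with this structure is two-stage (delayed-acceptance) Metropolis-Hastings, sketched below for a one-dimensional parameter with a symmetric Gaussian random-walk proposal. It is shown as an example of an emulator-corrected sampler, not as the method the project will adopt. The names log_post (the expensive log-posterior), log_post_emulator (its cheap approximation) and step_size are hypothetical.

```python
import numpy as np


def delayed_acceptance_mh(log_post, log_post_emulator, x0, n_steps, step_size, rng):
    """Two-stage Metropolis-Hastings with a symmetric random-walk proposal.

    Stage 1 screens each proposal using only the cheap emulator. Stage 2
    evaluates the expensive log-posterior for the survivors and applies a
    correction so that the chain still targets the true posterior.
    """
    x = float(x0)
    lp, lp_em = log_post(x), log_post_emulator(x)
    samples = np.empty(n_steps)
    n_expensive = 0  # number of expensive evaluations inside the loop
    for i in range(n_steps):
        y = x + step_size * rng.standard_normal()
        lp_em_y = log_post_emulator(y)
        # Stage 1: ordinary Metropolis test against the emulator.
        if np.log(rng.uniform()) < lp_em_y - lp_em:
            n_expensive += 1
            lp_y = log_post(y)
            # Stage 2: true posterior ratio divided by the emulator ratio
            # that was already used in stage 1.
            if np.log(rng.uniform()) < (lp_y - lp) - (lp_em_y - lp_em):
                x, lp, lp_em = y, lp_y, lp_em_y
        samples[i] = x
    return samples, n_expensive
```

Proposals rejected at stage 1 never trigger an expensive evaluation, which is where the saving comes from. A poor emulator does not bias the samples in this scheme, but it does lower the stage 2 acceptance rate and so reduces efficiency.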
Automated selection of hyperparameters. The behaviour of MCMC sampling algorithms is typically governed by several tuning parameters (hyperparameters). These hyperparameters can have a drastic effect on the performance of MCMC algorithms, but the ideal values may not be obvious and may vary from problem to problem. We propose to develop machine learning methods by which tuning parameters can be set to obtain optimal performance. For example, we consider the problem of maximizing some given metric of sampler performance (such as effective samples generated per unit of time) over MCMC hyperparameter configurations.
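The optimisation problem described above can be made concrete with a small sketch. It scores a one-dimensional random-walk Metropolis sampler by effective samples per second and picks the best step size from a list of candidates. The exhaustive search is a deliberately simple stand-in for the machine learning methods the project proposes, and the effective sample size estimator is a basic one that truncates the autocorrelation sum at the first non-positive lag. All function names are hypothetical.

```python
import time

import numpy as np


def random_walk_metropolis(log_post, x0, n_steps, step_size, rng):
    """One-dimensional random-walk Metropolis sampler."""
    x, lp = float(x0), log_post(x0)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        y = x + step_size * rng.standard_normal()
        lp_y = log_post(y)
        if np.log(rng.uniform()) < lp_y - lp:
            x, lp = y, lp_y
        chain[i] = x
    return chain


def effective_sample_size(chain):
    """Basic ESS estimate: n / (1 + 2 * sum of leading positive autocorrelations)."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = x.size
    var = np.dot(x, x) / n
    if var == 0.0:
        return 0.0  # the chain never moved
    tau = 1.0
    for lag in range(1, n):
        rho = np.dot(x[:-lag], x[lag:]) / (n * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau


def tune_step_size(log_post, candidates, n_steps=5000, seed=0):
    """Return the candidate step size with the highest ESS per second."""
    scores = {}
    for step_size in candidates:
        rng = np.random.default_rng(seed)
        start = time.perf_counter()
        chain = random_walk_metropolis(log_post, 0.0, n_steps, step_size, rng)
        elapsed = time.perf_counter() - start
        scores[step_size] = effective_sample_size(chain) / elapsed
    best = max(scores, key=scores.get)
    return best, scores
```

Because the score depends on wall-clock time, it is noisy and machine dependent; a practical tuner would need to account for that noise, and for the cost of the tuning runs themselves.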
Remit. This project falls within the EPSRC Mathematical Sciences research theme. The particular research areas covered by this project include Artificial Intelligence Technologies, Mathematical Biology, and Statistics and Applied Probability.
Companies and collaborators involved. All theoretically developed approaches will be tested on real-world data provided by our collaborators in the Chemistry Departments at York and Monash University, and at the Roche Innovation Center in Basel.

Publications


Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/R513295/1      |              |              | 01/10/2018 | 30/09/2023 |
2285274           | Studentship  | EP/R513295/1 | 01/10/2019 | 31/03/2023 | Richard Creswell