Real-time phylogenetics using sequential Monte Carlo with tree sequences

Lead Research Organisation: University of Warwick
Department Name: Statistics

Abstract

The COVID-19 pandemic has highlighted the importance of a number of different scientific fields in helping to tackle the spread of an infectious disease. One of these fields is pathogen genomics, which has, through the analysis of sequenced SARS-CoV-2 genomes, enabled the detection and tracking of different variants of the virus. The UK is particularly strong in this area, with the COVID-19 Genomics UK Consortium (COG-UK) providing important inputs into the government response to the pandemic.

Rapid sequencing of pathogen genomes is now possible. One use of this data is the real-time tracking of pathogen evolution and transmission through reconstruction of the ancestral history of sequenced genomes (phylogenetic inference). The pandemic has shown the value of having such information available (see, for example, the work of Nextstrain, https://nextstrain.org/).

The state of the art in phylogenetic inference is Bayesian inference of the ancestral history using Markov chain Monte Carlo (MCMC) algorithms. The Bayesian approach is preferred because it rigorously describes the uncertainty associated with inferences drawn from the data. It is implemented in the BEAST (https://beast.community/) and BEAST2 (beast2.org) packages, but in their current form these are unsuitable for "real-time" inference, since they perform inference on a fixed batch of genome sequences. If a new sequence becomes available after an MCMC run has started, the algorithm must be restarted to take account of the new data. These MCMC algorithms are often run for tens of millions of iterations, so restarting is computationally wasteful and hinders the goal of real-time inference. This project proposes an alternative approach, with the aim of making real-time Bayesian inference feasible for large numbers of sequences, preparing the ground for a deployable system that could be used during a pandemic.

Publications

Title: ilike
Description: Software for Bayesian inference for intractable models.
Type Of Technology: Software
Year Produced: 2022
Open Source License?: Yes
Impact: This software allows researchers to access a number of state-of-the-art algorithms for Bayesian computation.
URL: https://github.com/maugu/ilike