Bayesian reconstruction of infectious diseases transmission flows at the individual and population levels

Lead Research Organisation: Imperial College London
Department Name: Mathematics

Abstract

1 Introduction
My PhD thesis will consist of two projects that reconstruct infectious disease transmission flows at the individual and population levels. The first project focuses on reconstructing HIV-1 transmission chains using deep-sequence data from Uganda. Transmission chains are the fundamental building blocks of the dynamics of any infectious disease. Estimating the time of infection, pairwise linkage, and direction of transmission gives unique insight into disease dynamics, including the likely receiver and spreader groups within a host population.
The second project investigates the reconstruction of age-specific COVID-19 transmission flows in the United States. Since mid-June 2020, the daily number of reported COVID-19 cases has resurged in the United States, surpassing 40,000 daily reported cases on June 26 [1]. We analyse aggregated, age-specific mobility trends from more than 10 million individuals in the United States and link these mechanistically to age-specific COVID-19 mortality data. In contrast to previous approaches, we link mobility to mortality via age-specific contact patterns and use this rich relationship to reconstruct accurate transmission dynamics.
2 Aims and objectives
In both projects, we aim to reconstruct transmission flows to understand the underlying factors that drive the epidemics. The motivation of the first project is to identify, in a host population infected with HIV-1, the pairwise linkage and direction of transmission as well as the time of transmission. We reconstruct transmission chains (who-transmitted-to-whom) in a base population in Uganda.
In the second project, we present a fundamentally new view of the US epidemic until August 2020 based on analysis of novel, large-scale, population-level data. We assess how the age-specific movement of individuals and their contact patterns affect transmission and COVID-19-attributable deaths. We identify the population age groups driving SARS-CoV-2 spread, and quantify the likely impact of school re-opening on case and death counts under the scenario that transmission from the age groups that primarily drive transmission continues uninterrupted.
3 Novelty of the research methodology
In the first project, we use new deep-sequence data collected in a large population-based sample of infected individuals in Rakai District, Uganda. We develop novel likelihoods for both data sources. We then derive the prior of the transmission chain, which includes the times of infection and the source vector. We create a Metropolis-Hastings algorithm to sample from the transmission chain posterior. To establish proof of principle, we investigate 7 pre-selected networks out of the 493 transmission networks from Ratmann et al. [2].
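As a rough illustration of the sampling step, the sketch below shows a generic random-walk Metropolis-Hastings sampler in Python. It is not the sampler developed in the thesis, which targets a posterior over infection times and source vectors. Here the target is a hypothetical one-dimensional unnormalised log posterior (`toy_log_posterior`), and the function names, proposal scale and chain length are all assumptions chosen only to show the accept/reject mechanics.

```python
import math
import random


def metropolis_hastings(log_post, init, n_steps, step_sd, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    x = init
    lp = log_post(x)
    samples = []
    accepted = 0
    for _ in range(n_steps):
        # Propose a move around the current state.
        prop = x + rng.gauss(0.0, step_sd)
        lp_prop = log_post(prop)
        # The proposal is symmetric, so the acceptance ratio
        # reduces to the ratio of posterior densities.
        log_alpha = lp_prop - lp
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x, lp = prop, lp_prop
            accepted += 1
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples, accepted / n_steps


def toy_log_posterior(t):
    """Hypothetical target: unnormalised log density of a Normal(2, 1)."""
    return -0.5 * (t - 2.0) ** 2


rng = random.Random(0)
samples, acc_rate = metropolis_hastings(toy_log_posterior, 0.0, 20000, 1.0, rng)
burned = samples[2000:]  # discard an initial burn-in period
post_mean = sum(burned) / len(burned)
```

In this toy setting `post_mean` should settle near 2, the mean of the assumed target. A sampler over transmission chains would replace the Gaussian proposal with moves over infection times and sources, and would need the proposal ratio in the acceptance step whenever those moves are not symmetric.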
In the second project, we incorporate our mobility data into a Bayesian contact-and-infection model that describes time-changing contact and transmission dynamics at state and metropolitan-area level across the United States. SARS-CoV-2 spreads via person-to-person contacts. The contact intensities are used to estimate the rate of SARS-CoV-2 transmission, and subsequently infections and deaths. Infection dynamics in each location are modelled through age-specific, discrete-time renewal equations over time-varying contact intensities. We fit the model to age-specific mortality counts.
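To make the renewal-equation idea concrete, the following Python sketch simulates a simplified age-specific, discrete-time renewal process driven by time-varying contact intensities. It is a deterministic toy under stated assumptions, not the Bayesian model described above: the function `simulate_renewal`, the two age groups, the contact matrices, the generation-interval weights and the per-contact transmissibility are all hypothetical values chosen for illustration, and no mortality data or inference is involved.

```python
def simulate_renewal(contacts, gen_interval, seed, pop, transmissibility):
    """Simplified age-specific discrete-time renewal process (illustrative only).

    contacts[t][a][b]: assumed daily contacts on day t that one infectious
        person in age group b has with people in age group a.
    gen_interval[k]: assumed probability that a transmission occurs k + 1
        days after the infector was infected.
    seed: infections per age group on day 0.
    pop: population size per age group.
    """
    n_days = len(contacts)
    n_age = len(pop)
    infections = [list(seed)]
    cumulative = list(seed)
    for t in range(1, n_days):
        # Infectious pressure from each group: past infections weighted
        # by the generation-interval distribution.
        pressure = []
        for b in range(n_age):
            total = 0.0
            for lag in range(1, min(t, len(gen_interval)) + 1):
                total += infections[t - lag][b] * gen_interval[lag - 1]
            pressure.append(total)
        new = []
        for a in range(n_age):
            susceptible = max(0.0, pop[a] - cumulative[a])
            force = sum(contacts[t][a][b] * pressure[b] for b in range(n_age))
            expected = transmissibility * (susceptible / pop[a]) * force
            # Never infect more people than remain susceptible.
            new.append(min(expected, susceptible))
        for a in range(n_age):
            cumulative[a] += new[a]
        infections.append(new)
    return infections


pop = [1_000_000.0, 1_000_000.0]
gen_interval = [0.2, 0.5, 0.3]
baseline = [[3.0, 1.0], [1.0, 2.0]]
reduced = [[1.5, 0.5], [0.5, 1.0]]
n_days = 30
# Hypothetical scenario: contact intensities halve from day 15 onwards.
contacts = [baseline if t < 15 else reduced for t in range(n_days)]
infections = simulate_renewal(contacts, gen_interval, [10.0, 0.0], pop, 0.3)
```

Each entry `infections[t][a]` is the expected number of new infections in age group `a` on day `t`. In a fitted model, quantities such as the contact intensities and transmissibility would be estimated from data rather than fixed by hand as they are here.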
4 Alignment to EPSRC's strategies and research areas
This project falls within the EPSRC Mathematical sciences research area.
5 Contributors
The 'Phylogenetics and Networks for generalized HIV Epidemics in Africa' consortium (PANGEA-HIV) is involved in the first project, and the Imperial College COVID-19 Response Team participated in the second project.
References
[1] Johns Hopkins University. "COVID-19 Dashboard". Available at https://coronavirus.jhu.edu/map.html. 2020.
[2] Oliver Ratmann et al. "Inferring HIV-1 transmission networks and sources of epidemic spread in Africa with deep-sequence phylogenetic analysis". In: Nature Communic

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is a range of further impacts:
- The CDT has a large number of high-calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting-edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. This research will be relevant, addressing the key questions behind real-world problems. It will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximize reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023151/1       -             -             01/04/2019  30/09/2027  -
2641932            Studentship   EP/S023151/1  01/10/2019  31/12/2022  Melodie Monod