Deep Poisson process pathogen phylodynamics to accelerate understanding in disease transmission
Lead Research Organisation:
Imperial College London
Department Name: Mathematics
Abstract
Deciphering the human genetic code was a landmark scientific achievement that led to the development of personalised medicine, gene therapies, and modern vaccines. Today, the genetic codes of major organisms, and of the viral and bacterial pathogens that compromise human health, are identified (or sequenced) at low cost and at industrial scale, including for the purpose of reconstructing how infectious diseases spread in human populations and how that spread can be stopped.
The genetic relationships of pathogen variants provide objective data about who infected whom, information that is otherwise hard to obtain. The mathematical and statistical theory that underlies the analysis of these data is called 'phylodynamics'. This theory has made it possible to reconstruct and quantify how anti-microbial resistant pathogens have spread worldwide, in communities, or in hospital wards, and how novel COVID-19 variants emerge and replace each other.
This EPSRC project aims to develop a novel class of statistical phylodynamic theory, grounded in deep Poisson point processes, that is substantially more flexible and computationally faster than existing methods. Our preliminary findings indicate that this approach has the potential to unlock the analysis of important questions about the age, behavioural characteristics, locations, mobility patterns or other characteristics of the population groups that are the sources of pathogenic spread, questions that to date have been very challenging or impossible to address. We will develop the statistical theory and provide open-access and computationally scalable code for flexible and reproducible analyses. This project will benefit from close ties to the Machine Learning & Global Health network (development of deep non-parametric methods), the international PANGEA-HIV consortium (access to large-scale, rich and globally important data collected over the past 10 years), the UK Health Security Agency (aiming to use our methods in the UK) and Oxford Nanopore (transitional industry impact).
Publications
Bu F
(2024)
Inferring HIV transmission patterns from viral deep-sequence data via latent typed point processes.
in Biometrics
Description | 1/ This work helped to substantiate earlier findings that the primary HIV transmission pathway into young and adolescent women is via substantially older men, typically more than 6 years older. |
Exploitation Route | 1/ The outcomes under point (1) listed above may be used to update HIV prevention programming in Southern and Eastern Africa. |
Sectors | Healthcare |
URL | https://www.unaids.org/en/resources/documents/2023/global-aids-update-2023 |
Description | The findings from this award have substantiated earlier evidence that the primary transmission pathway into young & adolescent women is via substantially older men, typically >6 years older. These findings have been reported in the UNAIDS Global AIDS Update 2023, and thereby reached a broad audience specialising in global health and HIV prevention. |
First Year Of Impact | 2024 |
Sector | Healthcare |
Impact Types | Policy & public services |
Title | Code for Poisson point process model for analysing transmission sources |
Description | This technology asset comprises the code for the Poisson point process model for analysing transmission sources. |
Type Of Material | Technology assay or reagent |
Year Produced | 2024 |
Provided To Others? | Yes |
Impact | Pending |
URL | https://github.com/fanbu1995/HIV-transmission-PoissonProcess |
Description | PANGEA-HIV |
Organisation | University of Oxford |
Department | Big Data Institute |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Methodological development for analysis of HIV deep-sequence data. |
Collaborator Contribution | This grant contributed novel statistical methods for the analysis of infectious disease transmission sources from HIV deep-sequence data. The particular novelty of this contribution lies in statistical point process models that can analyse information on the direction of transmission more computationally efficiently than previous approaches, because they avoid discretising and fully enumerating the state space of the target variables. |
Impact | Pending; our methodological contribution was published only in 2024. |
Start Year | 2017 |
Title | Code for "Inferring HIV transmission patterns from viral deep-sequence data via latent typed point processes" |
Description | The software provides the code for the Poisson process model developed for inferring transmission sources from deep-sequence data. |
Type Of Technology | Software |
Year Produced | 2024 |
Open Source License? | Yes |
Impact | Pending |
URL | https://github.com/fanbu1995/HIV-transmission-PoissonProcess |
Description | Real-time HIV molecular epidemiology |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Policymakers/politicians |
Results and Impact | Use of HIV molecular epidemiology for public health surveillance, especially in relation to 1/ the application of HIV genomics for clinical management and public health surveillance; 2/ bioinformatic approaches and tools for HIV genomics in public health surveillance; 3/ ethical considerations in the use of HIV phylogenetics for public health surveillance; 4/ the role of HIV molecular epidemiology in informing public health policy for different settings; 5/ governance of the UK HIV Drug Resistance Database at UKHSA. |
Year(s) Of Engagement Activity | 2024 |