High-resolution semi-parametric Bayesian modelling of human contact dynamics.

Lead Research Organisation: Imperial College London
Department Name: Mathematics

Abstract

This project falls within the EPSRC mathematical sciences research area.

Project summaries:
Infectious diseases such as HIV are transmitted through sexual contact. To understand disease spread and characterise dynamic sexual behaviour within a closed population, statistical models are essential for inferring contact patterns at high resolution (1-year age bands by gender). Contact patterns are encoded in contact matrices, which quantify the number of sexual partners that one person in a given age-gender group has, in a year, within a specific age-gender population. In the context of HIV, we infer contact patterns from unprecedented sexual contact data collected by the longitudinal, open Rakai Community Cohort Study (RCCS) in Uganda, East Africa. These data are subject to reporting bias and survey limitations. The existing literature provides evidence that women under-report sexual contacts, which introduces a reporting bias by gender. However, the factors that drive this under-reporting remain unclear. We leverage the symmetric property of contacts and rely on the male-reported contact data to capture the under-reported contacts of women. Additionally, we analyse the occupations of women to explore how under-reporting is distributed among them. A further limitation of the survey is that participants reported some contacts with little detail about the partner involved, which yields uninformative contacts.

To address these issues, in collaboration with the Rakai Health Sciences Program, we develop a semi-parametric Bayesian model that estimates high-resolution contact patterns while adjusting for the potential effects of under-reported contacts and uninformative contacts in a unified framework. In this framework, contact patterns are described through random functions, which are approximated in a computationally efficient manner with Hilbert space Gaussian process priors.
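As an illustration of the Hilbert space Gaussian process approximation mentioned above, the following is a minimal one-dimensional sketch in Python. It is not the project's model: the squared-exponential kernel, the age range of 15 to 49, the number of basis functions, the boundary factor of 1.5 and all function names are assumptions made for this example, and the actual contact surfaces are functions of two ages rather than one. The sketch replaces a Gaussian process on a bounded interval by a weighted sum of sine basis functions, so that a prior draw costs one matrix-vector product instead of a Cholesky factorisation of the full covariance matrix.

```python
import numpy as np

def hsgp_basis(x, num_basis, half_width):
    """Laplacian eigenfunctions on [-half_width, half_width], evaluated at x.

    Returns the (len(x), num_basis) basis matrix and the square roots of
    the corresponding eigenvalues.
    """
    j = np.arange(1, num_basis + 1)
    sqrt_lam = j * np.pi / (2.0 * half_width)
    phi = np.sqrt(1.0 / half_width) * np.sin(
        sqrt_lam[None, :] * (x[:, None] + half_width)
    )
    return phi, sqrt_lam

def se_spectral_density(omega, alpha, lengthscale):
    """Spectral density of the 1-D squared-exponential kernel (assumed kernel)."""
    return (alpha**2 * np.sqrt(2.0 * np.pi) * lengthscale
            * np.exp(-0.5 * (lengthscale * omega) ** 2))

def se_kernel(x, alpha, lengthscale):
    """Exact squared-exponential covariance matrix, for comparison only."""
    d = x[:, None] - x[None, :]
    return alpha**2 * np.exp(-0.5 * (d / lengthscale) ** 2)

# Illustrative 1-year age bands, centred so the domain is symmetric about 0.
ages = np.arange(15, 50, dtype=float)
x = ages - ages.mean()

# Hypothetical settings: 40 basis functions, boundary 1.5x the data half-range.
num_basis = 40
half_width = 1.5 * np.max(np.abs(x))
alpha, lengthscale = 1.0, 5.0

phi, sqrt_lam = hsgp_basis(x, num_basis, half_width)
spd = se_spectral_density(sqrt_lam, alpha, lengthscale)

# Low-rank approximation of the covariance: Phi diag(S) Phi^T.
k_approx = (phi * spd) @ phi.T
k_exact = se_kernel(x, alpha, lengthscale)
max_abs_error = np.max(np.abs(k_approx - k_exact))

# One prior draw of the random function: f = Phi (sqrt(S) * beta), beta ~ N(0, I).
rng = np.random.default_rng(0)
beta = rng.standard_normal(num_basis)
f_draw = phi @ (np.sqrt(spd) * beta)
```

In this sketch the approximation error is largest near the edges of the age range, which is why the basis is defined on an interval wider than the data. In a Bayesian model the coefficients `beta` would be inferred together with `alpha` and `lengthscale` rather than drawn once.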

We focus on 40 surveyed communities in inland and fishing areas around Lake Victoria, and on two survey rounds, conducted from August 2011 to August 2013 and from June 2018 to November 2020. We uncover substantial under-reporting of sexual contacts by women within their communities in the high-HIV-prevalence fishing communities, and less so in the lower-prevalence inland communities. The under-reporting-adjusted sexual contact intensities in fishing communities are nearly two-fold higher than in inland communities, and their age structure differs considerably. Over time, we find no significant changes in sexual contact patterns in either inland or fishing communities. Our findings indicate that there were no substantial changes in sexual contact behaviour from 2011 to 2020 in Rakai, Uganda, that could causally explain the marked changes in HIV incidence during the same period. Additionally, we investigate occupations of women (such as sex work) that correlate significantly with under-reported contacts.

In applying our model to the RCCS contact survey, we provide a detailed picture of the sexual contact patterns by gender. This work promises to aid the understanding of sexual contact behaviour in East Africa, more realistic parameterisations of infectious disease models, and a deeper understanding of how HIV-related diseases are propagated through populations.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is a range of further impacts:
- The CDT has a large number of high-calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting-edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships and placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. This research will be relevant, addressing the key questions behind real-world problems. It will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximise reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

People

Yu Chen (Student)

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/S023151/1      |              |              | 31/03/2019 | 29/09/2027 |
2602755           | Studentship  | EP/S023151/1 | 30/09/2021 | 29/09/2025 | Yu Chen
EP/T51780X/1      |              |              | 30/09/2020 | 29/09/2025 |
2602755           | Studentship  | EP/T51780X/1 | 30/09/2021 | 29/09/2025 | Yu Chen