Causal machine learning for multiple treatments and multiple outcomes in dynamic treatment regimes

Lead Research Organisation: University of Oxford

Abstract

It has recently been shown that patients need more personalised treatments that can evolve over time depending on their response. This is formalised through a dynamic treatment regime: a sequence of decision rules over multiple time points, where the next treatment is chosen based on the patient's history (former states, former treatments and current state). In addition, in a context where multimorbidity and polypharmacy are rising, a single medication might affect not one but several conditions, while multiple medications might be required simultaneously to address multiple conditions. However, this is overlooked in current clinical trials, which most often study one intervention for one disease. For example, patients with both cardiovascular disease and high blood pressure might be excluded from studies on cardiovascular disease.
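
As a minimal sketch of the definition above, a dynamic treatment regime can be written as a list of decision rules, each mapping the history so far to the next treatment. Everything named here (`History`, `run_regime`, the threshold rule and the toy transition function) is a hypothetical illustration and not part of the project's methodology.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class History:
    """Patient history: all states observed so far and all treatments given so far."""
    states: List[float] = field(default_factory=list)    # former states, then current state
    treatments: List[int] = field(default_factory=list)  # former treatments


# A decision rule maps the history at one time point to a treatment.
DecisionRule = Callable[[History], int]


def run_regime(rules: List[DecisionRule],
               initial_state: float,
               transition: Callable[[float, int], float]) -> History:
    """Apply one decision rule per time point, growing the history as we go."""
    history = History(states=[initial_state])
    for rule in rules:
        treatment = rule(history)                 # choose treatment from the history
        history.treatments.append(treatment)
        next_state = transition(history.states[-1], treatment)
        history.states.append(next_state)         # patient response becomes the new state
    return history


def threshold_rule(history: History) -> int:
    """Hypothetical rule: treat (1) only if the current state exceeds 1.5."""
    return 1 if history.states[-1] > 1.5 else 0


def toy_transition(state: float, treatment: int) -> float:
    """Assumed toy dynamics: treatment lowers the state by 1, no treatment raises it by 0.5."""
    return state - 1.0 if treatment == 1 else state + 0.5


result = run_regime([threshold_rule] * 3, initial_state=2.0, transition=toy_transition)
print(result.treatments, result.states)
```

In this toy run the same rule is reused at every time point, but the list could hold a different rule per time point, which is what makes the regime dynamic.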

Therefore, the general aim of this project is to construct reliable causal inference methods for multiple, parallel treatments and outcomes within each time point of a dynamic treatment regime. In particular, these methods are intended for modern longitudinal observational data, such as electronic health records, which provide an alternative to sequential randomised trials.

A key aspect of the research pertaining to these aims is developing reliable dimensionality reduction techniques suited to longitudinal studies. Indeed, the dimension of the history grows linearly with the time point, as each state and treatment decision is added to the history used to decide the next treatment, and as a consequence the number of possible state and treatment histories grows exponentially. We will explore modern machine learning techniques such as representation learning and deep generative modelling to provide such low-dimensional spaces.
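
The growth described above can be illustrated with a small counting sketch. It assumes, purely for illustration, that states and treatments are discrete (binary by default) and that the history at time point t contains t + 1 states and t treatments; the function names are hypothetical.

```python
def history_dimension(t: int, state_dim: int = 1, treatment_dim: int = 1) -> int:
    """Number of entries in the history at time point t: t + 1 states and t treatments."""
    return (t + 1) * state_dim + t * treatment_dim


def num_possible_histories(t: int, n_states: int = 2, n_treatments: int = 2) -> int:
    """Number of distinct histories at time point t for discrete states and treatments."""
    return n_states ** (t + 1) * n_treatments ** t


for t in range(5):
    # Dimension grows linearly in t; the count of possible histories grows exponentially.
    print(t, history_dimension(t), num_possible_histories(t))
```

Under these assumptions the dimension is 2t + 1 while the number of possible histories is 2^(2t + 1), which is why a low-dimensional representation of the history is attractive.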

In addition, these spaces will be usable in settings such as matching. In a static, single-intervention setting, matching is a methodology for forming similar (or "balanced") treatment and control groups. Despite its ubiquitous use in the medical and social sciences, relatively little work has been done to extend it to a longitudinal setting, let alone to dynamic treatment regimes. We aim to provide a generalisation of matching to a multi-treatment, time-varying setting, using a more suitable notion of balance and adequate dimensionally reduced spaces. As a result, we will in part circumvent the challenge of regression in the dynamic treatment regime, where the non-regularity of the outcome function remains an open research problem.
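
For readers unfamiliar with matching, the static, single-intervention case can be sketched as follows. This is a minimal illustration of one common variant (nearest-neighbour matching with replacement on Euclidean distance), not the generalisation proposed in this project; the function names and the toy covariates are assumptions made for the example.

```python
from typing import List, Sequence


def match_nearest(treated: Sequence[Sequence[float]],
                  controls: Sequence[Sequence[float]]) -> List[int]:
    """For each treated unit, return the index of its nearest control (with replacement)."""
    def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [min(range(len(controls)), key=lambda j: sq_dist(x, controls[j]))
            for x in treated]


def mean_difference(group_a: Sequence[Sequence[float]],
                    group_b: Sequence[Sequence[float]]) -> List[float]:
    """Absolute difference in covariate means between two groups (a simple balance check)."""
    n_cov = len(group_a[0])
    mean_a = [sum(x[k] for x in group_a) / len(group_a) for k in range(n_cov)]
    mean_b = [sum(x[k] for x in group_b) / len(group_b) for k in range(n_cov)]
    return [abs(a - b) for a, b in zip(mean_a, mean_b)]


# Toy covariates (or low-dimensional representations) for treated and control units.
treated = [[1.0, 0.0], [2.0, 1.0]]
controls = [[0.9, 0.1], [5.0, 5.0], [2.1, 0.8]]

matches = match_nearest(treated, controls)
matched_controls = [controls[j] for j in matches]

print(matches)
print(mean_difference(treated, controls))          # imbalance before matching
print(mean_difference(treated, matched_controls))  # imbalance after matching
```

The covariate means of the matched control group are much closer to those of the treated group than the means of the full control group, which is the sense of "balanced" used above.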

This project falls within the EPSRC "Artificial intelligence technologies", "Statistics and applied probability" and "Healthcare technologies" research areas. It is co-funded by Novo Nordisk and is supervised by Professor Chris Holmes.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their career. There are a range of further impacts:
- The CDT has a large number of high calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. It will be relevant research, addressing the key questions behind real world problems. The research will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximize reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023151/1                                   01/04/2019  30/09/2027
2635641            Studentship   EP/S023151/1  01/10/2020  30/09/2024  Oscar Clivio