Enhancing the design and analysis of cluster randomised trials using machine learning

Lead Research Organisation: London School of Hygiene and Tropical Medicine
Department Name: Epidemiology and Population Health

Abstract

Trials are important to evaluate the safety and efficacy of new treatments or interventions. One type of trial, called a cluster randomised trial (CRT), uses pre-existing groups of individuals - known as clusters - who are randomly allocated to different treatments. This means that every member of the same cluster, for instance members of the same family or patients from the same hospital, will receive the same treatment. This type of trial is very useful in real-world settings where individual randomisation to treatments is not possible or the intervention is naturally applied to a whole cluster. However, this type of trial requires the use of specific analysis methods to account for the similarity in the response to treatments among members of the same cluster. Moreover, cluster randomised trials are often prone to bias, which can be corrected at the analysis stage with the use of additional information about the individuals and clusters included in the trial. This additional information can also be used to improve the precision of the trial, and to identify the individuals who will most benefit from the intervention being tested. Therefore, it is crucial to select the important variables and use appropriate statistical methods accounting for these variables to obtain an accurate estimation of the treatment efficacy and safety. However, the best way to do so remains unknown, especially in an era where there is a large amount of medical information available. In this fellowship I will draw on the emerging field of machine learning to address these methodological challenges and to determine how routinely collected medical data can be best used to improve the design of CRTs. I will use several existing trial datasets to achieve this, including CRTs in the fields of pharmacy and psychiatry, as well as data from the England Cancer Registry.
Drawing on these multiple datasets but also using mathematical developments and computer-based simulation studies, I will develop and evaluate methods to improve the generalisability and precision of CRTs, and bring us one step closer to a more personalised medicine approach for patients.

Technical Summary

Pragmatic trials emerged in response to concerns that clinical trials, traditionally designed to assess efficacy, were failing to inform clinical practice. Assessing benefits and harms in highly selected patient populations under well-trained, experienced clinical teams can lead to overestimation of benefit and underestimation of harm in practice. Pragmatic trials, conversely, aim to assess benefits, harms, and cost-effectiveness in real-world settings, and to identify subgroups for whom the intervention is most effective. Cluster Randomised Trials (CRTs) have emerged as an important design for pragmatic trials. However, there are a number of methodological challenges that must be addressed in order for CRTs to fulfil their promise, including enhancing their generalisability, reducing bias, improving precision, and identifying individualised intervention effects. In this fellowship I will gain theoretical and practical training in the emerging area of machine learning (ML) to address the above challenges. To achieve this, I will:
1. Develop a framework to emulate CRTs from observational studies to evaluate the effect of cluster-level interventions and compare parametric and non-parametric methods, including ML approaches, to estimate inverse-probability weights to address confounding;
2. Provide practical guidance for researchers on how to exploit ML to improve the selection of, and adjustment for, covariates in CRTs, in order both to increase precision by reducing the intra-cluster correlation and to minimise bias by adjusting for confounding;
3. Propose and evaluate ML methods to study heterogeneity in intervention effects in CRTs and estimate subgroup effects while maintaining valid confidence intervals.
These objectives will be achieved by developing, extending and assessing ML methods using a mixture of theory, simulation studies and applications to case studies, including CRTs in pharmacy and psychiatry and observational data from Cancer Registries.
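
As a rough illustration of the inverse-probability weighting mentioned in objective 1, the sketch below estimates a cluster-level intervention effect from observational data. It is not the fellowship's method: it assumes a single categorical cluster-level confounder, estimates the propensity score non-parametrically as the proportion of treated clusters in each stratum, and uses a normalised weighted mean. The function names, dictionary keys and data are all hypothetical.

```python
from collections import defaultdict


def naive_effect(clusters):
    """Unadjusted difference in mean outcome between treated and untreated clusters."""
    treated = [c["outcome"] for c in clusters if c["treated"] == 1]
    control = [c["outcome"] for c in clusters if c["treated"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_effect(clusters):
    """Inverse-probability-weighted difference in mean outcome.

    Each cluster is a dict with hypothetical keys:
      'stratum' : categorical cluster-level confounder
      'treated' : 1 if the cluster received the intervention, else 0
      'outcome' : cluster-level mean outcome
    Assumes both arms contain at least one cluster overall.
    """
    # Non-parametric propensity score: proportion of treated clusters per stratum.
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_clusters, n_treated]
    for c in clusters:
        counts[c["stratum"]][0] += 1
        counts[c["stratum"]][1] += c["treated"]

    weighted_sum = {0: 0.0, 1: 0.0}
    weight_total = {0: 0.0, 1: 0.0}
    for c in clusters:
        n, n_treated = counts[c["stratum"]]
        p = n_treated / n
        # Probability of the allocation the cluster actually received.
        prob_received = p if c["treated"] == 1 else 1 - p
        w = 1 / prob_received
        weighted_sum[c["treated"]] += w * c["outcome"]
        weight_total[c["treated"]] += w

    # Normalised (weights sum to one within each arm) weighted means.
    return weighted_sum[1] / weight_total[1] - weighted_sum[0] / weight_total[0]


# Made-up data: outcome = stratum baseline (A: 10, B: 20) + 2 if treated,
# with treatment more common in stratum B, so the stratum confounds the comparison.
example = (
    [{"stratum": "A", "treated": 1, "outcome": 12}]
    + [{"stratum": "A", "treated": 0, "outcome": 10}] * 3
    + [{"stratum": "B", "treated": 1, "outcome": 22}] * 3
    + [{"stratum": "B", "treated": 0, "outcome": 20}]
)

print(naive_effect(example))
print(ipw_effect(example))
```

In this toy dataset the unadjusted comparison is distorted by the imbalance between strata, whereas the weighted comparison recovers the built-in effect of 2. A parametric or ML-based propensity model would replace the per-stratum proportion when there are many or continuous confounders.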

Planned Impact

During this fellowship, I will develop novel statistical methodology using a variety of methods, including machine learning (ML), to enhance the design and analysis of cluster randomised trials (CRTs). This research will focus in particular on reducing bias and increasing the precision of CRTs, identifying the subgroups of patients who benefit most from the intervention, and using external routinely collected data to better inform the design of CRTs.

Immediate beneficiaries of the project include researchers involved in planning and running CRTs, as well as statisticians with an interest in the methodology of CRTs. This research will provide tools and information which will enable researchers to harness the potential of ML to improve the efficiency of their trials.

Patients would also be major beneficiaries of this research. Designing more efficient CRTs and gaining statistical power with the use of ML algorithms can lead to trials with smaller sample sizes, meaning that fewer patients would be exposed to potentially harmful interventions. Furthermore, smaller trials are usually shorter, which would reduce the delay between the start of the CRT and the moment the treatment or intervention studied is provided to patients in practice. As such, patients would be treated more quickly. Smaller and shorter trials are also less costly, which will benefit trial funders, such as research councils and charities. My third objective, which focuses on the identification of subgroup effects, will be a step towards more personalised medicine. Therefore, my research could allow patients to receive the optimal treatment based on their individual characteristics.
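
To make the link between intra-cluster correlation (ICC) and trial size concrete, the sketch below uses the standard design effect for equal-sized clusters, 1 + (m - 1) x ICC, where m is the cluster size. The sample size, cluster size and ICC values are illustrative assumptions and are not taken from this project, and the function names are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation for a CRT with equal cluster sizes: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def clusters_per_arm(n_individual, cluster_size, icc):
    """Clusters needed per arm, given the per-arm sample size that an
    individually randomised trial would require (n_individual)."""
    n_inflated = n_individual * design_effect(cluster_size, icc)
    return math.ceil(n_inflated / cluster_size)


# Illustrative values only: 200 participants per arm under individual
# randomisation, clusters of 20, and an ICC that falls from 0.05 to 0.02
# (for example after adjusting for covariates that explain between-cluster variation).
print(clusters_per_arm(200, 20, 0.05))
print(clusters_per_arm(200, 20, 0.02))
```

Under these assumed values, the lower ICC reduces the number of clusters required in each arm, which is the mechanism by which a more efficient analysis can translate into a smaller trial.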

Through the development of a new framework for the emulation of CRTs from observational data, this work will take advantage of the wealth of routinely collected medical information to evaluate the effect of interventions in real-life settings, making use of existing and usually large datasets rather than collecting new information. This will allow researchers to obtain accurate results in a more timely manner, and could also replace the expensive pilot CRTs usually conducted before the start of a larger-scale CRT.

Publications

Besançon L (2021) Open science saves lives: lessons from the COVID-19 pandemic. in BMC medical research methodology

 
Description Giens Workshop
Geographic Reach Europe 
Policy Influence Type Participation in a guidance/advisory committee
URL https://www.ateliersdegiens.org/
 
Description Member of the DSMC of a stepped-wedge cluster trial
Geographic Reach Europe 
Policy Influence Type Participation in a guidance/advisory committee
URL https://clinicaltrials.gov/ct2/show/NCT03892148
 
Description Policy document on Covid19 and science
Geographic Reach Multiple continents/international 
Policy Influence Type Citation in other policy documents
Impact Our paper provides recommendations and options for policy action to improve the resilience of national science systems
 
Description ROBEST: Ensuring robustness of evidence in public health research for increased policy impact: widened use of advanced causal inference techniques
Amount £420,279 (GBP)
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 03/2022 
End 03/2025
 
Description Collaboration - Dr Sophie Pilleron 
Organisation Luxembourg Institute of Health
Country Luxembourg 
Sector Academic/University 
PI Contribution I contributed to the computing part of the project and performed a simulation study to evaluate the impact of immortal-time bias among older and younger cancer patients.
Collaborator Contribution The partners came up with the concept of the study, the hypotheses as well as the data used for the analysis.
Impact Pilleron S, Maringe C, Morris EJA, Leyrat C. Immortal-time bias in older vs younger age groups: a simulation study with application to a population-based cohort of patients with colon cancer. Br J Cancer. 2023 Feb 9. doi: 10.1038/s41416-023-02187-0. Multi-disciplinary collaboration involving epidemiologists (SP and EJAM)
Start Year 2021
 
Description Collaboration - Duplicate^2 
Organisation University of Grenoble
Country France 
Sector Academic/University 
PI Contribution This new collaboration aims to study the feasibility and reproducibility of target trial emulation (including cluster trial emulation) from the French SNDS data. I will provide methodological advice informed by my fellowship research.
Collaborator Contribution The partners are leading this working group and provide the resources to conduct the research.
Impact No output yet but grant application in preparation
Start Year 2023
 
Description Collaboration - SAP for CRTs 
Organisation University of Birmingham
Country United Kingdom 
Sector Academic/University 
PI Contribution Through this collaboration with multiple partners, I have contributed to the development of new guidelines for the statistical analysis plan of cluster randomised trials
Collaborator Contribution The main collaborators drafted the original guidelines and organised a series of Delphi rounds and consensus meetings to arrive at the final version
Impact - Publication of the protocol: "Guidelines for the Content of Statistical Analysis Plans in Clinical Trials: Protocol for an Extension to Cluster Randomized Trials". Karla Hemming; Jacqueline Y Thompson; Richard L Hooper; Obioha C Ukoumunne; Fan Li; Agnes Caille; Brennan C Kahan; Clemence Leyrat; Micheal J Graylin; Nuredin I Mohammed; Jennifer A Thompson; Bruno Giraudeau; Elizabeth L Turner; Samuel I Watson; Beatriz P Goulão; Jessica Kasza; Andrew B Forbes; Andrew J Copas; Monica Taljaard. 2025. Trials. In Press
Start Year 2023
 
Title R package MatchThem 
Description This R package aims to facilitate the use of multiple imputation in propensity score matched analyses. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Not known yet. 
URL https://cran.r-project.org/web/packages/MatchThem/index.html
 
Description Expert panel member - Observational data for drug evaluation 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Involvement in a panel of experts providing recommendations on how observational studies should be used by medicines agencies for drug licensing
Year(s) Of Engagement Activity 2021,2024
 
Description INSERM workshop on cluster randomised trials 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Clémence Leyrat led a session on cluster randomised trials at a professional workshop with around 40 participants, in Bordeaux, June 2023
Year(s) Of Engagement Activity 2023
URL https://ateliersinserm.dakini-pco.com/en/workshop.274.cluster.randomized.trials.and.within.person.ra...
 
Description ISCB conference presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Following presentation at an international biostatistics conference:

LEYRAT C, DIAZORDAZ K, WILLIAMSON E. Covariate adjustment in randomised trials: when and how? 41st Annual Conference of the International Society for Clinical Biostatistics. August 2020, virtual conference.
Year(s) Of Engagement Activity 2020
URL https://iscb2020.info/
 
Description ISCB presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation at the 45th Annual Conference of the International Society for Clinical Biostatistics.
Title: Emulation of target cluster trials of complex interventions: Estimands, methods and application
Year(s) Of Engagement Activity 2024
 
Description Live discussion on Open Science 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Participation in the live event "Open Science saves lives" on the channel LeGrandLabo on 29 September 2020.

This one-hour discussion focussed on the need for better research practices (following the Open Science principles) during the COVID-19 pandemic.
Year(s) Of Engagement Activity 2020
URL https://www.youtube.com/watch?v=lFtsB-9E5EU
 
Description Poster - Cluster trial emulation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Poster presentation on Target Trial Emulation at the Cancer Data Conference (CRUK) in Manchester, 27-28 February 2024
Year(s) Of Engagement Activity 2024
 
Description Presentation at Hopital st Louis 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Presentation on emulated trials (around 20 attendees) followed by a constructive discussion on next steps and potential collaborations
Year(s) Of Engagement Activity 2022
 
Description SIOG webinar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Webinar entitled: Making the most of observational data to estimate causal effects using target trial emulation: practical examples

Audience: International Society of Geriatric Oncology
Year(s) Of Engagement Activity 2024
 
Description Seminar Target Trial Emulation - INSERM SHERE, Tours, France 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Invited presentation on Cluster Target Trial emulation to ~20 researchers and PhD students
Year(s) Of Engagement Activity 2024
 
Description Webinar QuanTIM 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This webinar focused on emulated trials for observational studies and presented some of the work conducted during my fellowship. This led to my invitation to be a keynote speaker in an upcoming conference of the French Society for Pharmacology.
Year(s) Of Engagement Activity 2022
URL https://sesstim.univ-amu.fr/fr/content/webinar-quantim-clemence-leyrat
 
Description Website on cluster randomised trials 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Development and maintenance of a website dedicated to cluster randomised trials, hosted by QMUL, London:
https://clusterrandomisedtrials.qmul.ac.uk/who-we-are/

The aim is to help clinicians, trialists and statisticians design high quality cluster randomised trials.
Year(s) Of Engagement Activity 2020,2021
URL https://clusterrandomisedtrials.qmul.ac.uk/who-we-are/