The Regression Discontinuity Design: a novel approach to evaluating the effect of drugs and treatments in primary care

Lead Research Organisation: University College London

Department Name: Statistical Science

Abstract

A fundamental task in clinical practice is to determine whether a particular drug is being prescribed in the most effective way, in order to increase the clinical benefit derived from its use. While Randomised Clinical Trials (RCTs) are correctly considered to be the best scientific method for evaluating drug efficacy, these studies often have poor external validity because of patient selection, in particular the exclusion of patients with comorbidities. Prescription guidelines are not always evidence based and it typically falls to clinical experts to set them.
The regression discontinuity design (RDD) is an econometric quasi-experimental design aimed at estimating the causal effects of a treatment by exploiting naturally occurring treatment rules. It was first introduced in the educational economics literature in the 1960s but it has not been widely used outside of this field until recently. The RDD exploits the fact that many treatments are assigned according to pre-decided rules, eg those set by NICE in the UK.
The idea behind the RDD is that if we can assume that individuals just on either side of a pre-selected threshold (eg blood pressure of 140/90mmHg) belong to a common population with respect to the characteristics that inform the assignment rule and determine the outcome, then the threshold can be seen as a random intervention which assigns the treatment to those who fall just above it and no treatment to those who fall just below it. Because of the quasi-randomised nature of the RDD, confounding is eliminated or at least mitigated. We can thus use the RDD to estimate the effect of the treatment or a related exposure within the population around the threshold. However, although the RDD mimics randomisation, in our particular case it will still be necessary, in order to reduce bias in the estimates, to account in the analyses for other factors which may affect the decision to prescribe, including markers of patients' frailty or comorbidities.
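The threshold comparison described above can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration and not taken from the project: the function names, the simulated effect of -1.5, the bandwidth of 0.25 and the use of a simple unweighted linear fit on each side of the threshold.

```python
import numpy as np

def simulate_sharp_rdd(n=4000, threshold=0.0, effect=-1.5, seed=0):
    """Simulate a sharp design: treatment is given exactly when x >= threshold."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)         # assignment variable (hypothetical risk score)
    treated = (x >= threshold).astype(float)   # rule is fully adhered to
    y = 2.0 + 0.8 * x + effect * treated + rng.normal(0.0, 0.5, size=n)
    return x, treated, y

def sharp_rdd_estimate(x, y, threshold, bandwidth):
    """Fit a line on each side of the threshold and return the jump at the threshold."""
    xc = x - threshold                          # centre so the threshold sits at 0
    below = (xc < 0) & (xc >= -bandwidth)
    above = (xc >= 0) & (xc <= bandwidth)
    # np.polyfit returns [slope, intercept]; the intercept is the fitted value at the threshold.
    _, intercept_below = np.polyfit(xc[below], y[below], deg=1)
    _, intercept_above = np.polyfit(xc[above], y[above], deg=1)
    return intercept_above - intercept_below

x, treated, y = simulate_sharp_rdd()
estimate = sharp_rdd_estimate(x, y, threshold=0.0, bandwidth=0.25)
print(f"estimated effect at the threshold: {estimate:.2f} (simulated truth: -1.50)")
```

Only observations within the bandwidth contribute, which reflects the assumption that individuals close to the threshold are comparable. The choice of bandwidth trades bias against variance and is fixed arbitrarily here.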
In situations where the set-up is fully adhered to (ie when all individuals above the threshold are given the treatment and all those below it are not), the RDD is termed "sharp". Conversely, in cases where well-defined guidelines exist but, for some reason (possibly a completely legitimate one from a clinical perspective), they are not fully followed, the RDD is termed "fuzzy".
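The distinction can be written down using the standard estimands from the RDD literature. The notation is an assumption introduced here for illustration: X is the assignment variable, x_0 the threshold, T the treatment actually received and Y the outcome.

```latex
% Sharp design: treatment jumps from 0 to 1 at the threshold
\tau_{\text{sharp}}
  = \lim_{x \downarrow x_0} E[Y \mid X = x]
  - \lim_{x \uparrow x_0} E[Y \mid X = x]

% Fuzzy design: the jump in the outcome is scaled by the jump in treatment uptake
\tau_{\text{fuzzy}}
  = \frac{\lim_{x \downarrow x_0} E[Y \mid X = x] - \lim_{x \uparrow x_0} E[Y \mid X = x]}
         {\lim_{x \downarrow x_0} E[T \mid X = x] - \lim_{x \uparrow x_0} E[T \mid X = x]}
```

When the rule is followed exactly, the denominator equals 1 and the fuzzy expression reduces to the sharp one.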
RDD estimators have strong connections with other causal inference estimators in statistical research. However, in our case, we are faced with the complication that in epidemiology and clinical research the outcome of interest is often a binary variable (eg the occurrence of a clinical event, or mortality). Specific methods for such a case have not been extensively developed, although Instrumental Variable (IV) theory can be brought to bear. Specifically, the main difficulty lies in the fact that in this case the interest is in causal ratios, rather than causal differences (which are relevant in the case of continuous outcomes). Methods that can be used to derive such estimators rely on assumptions beyond those of the standard IV setting. Assumption A1) involves approximate linearity between a function of the outcome and the treatment and unobserved confounders; and A2) assumes no interaction between the treatment and the unobserved confounders (termed "no-effect modification on the multiplicative scale"). These cannot be directly tested (because they involve unobservable variables), but we will investigate their plausibility using simulations.
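To make the contrast between the two scales concrete, the quantities can be written for a binary outcome Y in the simplest setting, where treatment is assigned exactly by the threshold. The symbols (X for the assignment variable, x_0 for the threshold, p^+ and p^- for the limiting event probabilities) are assumptions introduced for illustration and are not the project's own notation.

```latex
p^{+} = \lim_{x \downarrow x_0} \Pr(Y = 1 \mid X = x), \qquad
p^{-} = \lim_{x \uparrow x_0} \Pr(Y = 1 \mid X = x)

\text{risk difference: } \; p^{+} - p^{-}, \qquad
\text{risk ratio: } \; \frac{p^{+}}{p^{-}}
```

When the threshold is not fully adhered to, the risk difference can still be rescaled by the jump in treatment uptake, but a causal risk ratio generally cannot be obtained in the same way, which is where additional assumptions such as A1 and A2 come in.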
Electronic healthcare records such as "The Health Improvement Network" (THIN) primary care database provide an excellent opportunity to investigate questions like those mentioned above. From THIN we can obtain detailed information on prescriptions made in primary care and measurements both before and after initiation of prescription, as well as adverse events.

Technical Summary

The regression discontinuity design (RDD) is an econometric quasi-experimental design aimed at estimating the causal effects of a treatment by exploiting treatment rules. It can be applied in any context where a particular treatment is administered according to a pre-defined rule linked to a continuous variable. Such thresholds exist in primary care in the context of drug prescription and we focus on the examples of statin and insulin prescription. The idea is that if we can assume that individuals just on either side of a threshold belong to a common population with respect to the characteristics informing the assignment rule and determining the outcome, then the threshold can be seen as a random intervention assigning the treatment to those falling just above the threshold and no treatment to those just below it. Regressions are typically used to predict the value of the outcome on either side of the threshold. We will use regression methods developed in econometrics, such as local linear regression, as well as more sophisticated regression models if the relationships we observe are non-linear. One methodological issue is that in most clinical cases the threshold is unlikely to be strictly adhered to. This is due to two levels of non-compliance: 1) whether GPs follow the prescription guidelines (GP decision making) and 2) whether patients who are prescribed a drug take it (patient prescription adherence). Since the RDD has several mathematical similarities with Instrumental Variables (IVs), we will draw on the econometric literature to estimate the treatment effects. However, since we will be concerned with binary outcomes, it will be necessary to extend standard IV methods. In particular, we will use a Bayesian approach; this will be beneficial because weak instruments are best handled within a Bayesian framework, and the corresponding methods can be adapted to the RDD when the guidelines are loosely adhered to.
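The link between partial adherence and IV estimation can be illustrated with a minimal frequentist sketch on simulated data. It is not the Bayesian approach proposed here, and every name, parameter value and modelling choice in it is an assumption made for illustration: the threshold indicator acts as the instrument, kernel-weighted local linear regression estimates the jumps in the outcome and in treatment uptake, and their ratio gives the effect estimate.

```python
import numpy as np

def fit_at_threshold(xc, v, bandwidth):
    """Triangular-kernel weighted linear fit of v on xc; returns the fitted value at xc = 0."""
    w = np.clip(1.0 - np.abs(xc) / bandwidth, 0.0, None)
    keep = w > 0
    design = np.column_stack([np.ones(keep.sum()), xc[keep]])
    sw = np.sqrt(w[keep])  # weighted least squares via square-root weights
    coef, *_ = np.linalg.lstsq(design * sw[:, None], v[keep] * sw, rcond=None)
    return coef[0]

def jump_at_threshold(xc, v, bandwidth):
    """Difference between the fits just above and just below the threshold."""
    above = xc >= 0
    return (fit_at_threshold(xc[above], v[above], bandwidth)
            - fit_at_threshold(xc[~above], v[~above], bandwidth))

def fuzzy_rdd_estimate(x, t, y, threshold, bandwidth):
    """Ratio of the outcome jump to the treatment-uptake jump (IV-style estimate)."""
    xc = x - threshold
    jump_y = jump_at_threshold(xc, y, bandwidth)
    jump_t = jump_at_threshold(xc, t, bandwidth)
    return jump_y / jump_t, jump_t

# Hypothetical simulation: u is an unobserved confounder of prescribing and outcome.
rng = np.random.default_rng(1)
n = 40_000
x = rng.uniform(-1.0, 1.0, size=n)
u = rng.normal(size=n)
p_treat = np.clip(0.2 + 0.6 * (x >= 0) + 0.1 * u, 0.0, 1.0)  # guideline loosely followed
t = (rng.uniform(size=n) < p_treat).astype(float)
y = 1.0 + 0.5 * x - 1.0 * t + 0.8 * u + rng.normal(0.0, 0.5, size=n)

late, jump_t = fuzzy_rdd_estimate(x, t, y, threshold=0.0, bandwidth=0.3)
naive = y[t == 1].mean() - y[t == 0].mean()  # ignores confounding by u and x
print(f"fuzzy RDD estimate: {late:.2f}, uptake jump: {jump_t:.2f}, naive contrast: {naive:.2f}")
```

The simulated treatment effect is -1.0. The smaller the jump in uptake at the threshold, the noisier the ratio becomes, which is the weak-instrument problem mentioned above.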

Planned Impact

Our proposal originates from the consideration that, while RCTs are a fundamental tool for evaluating health interventions, their limited external validity means that there is an increasing need to bridge from trials to the real world, drawing also on data from observational settings. However, for this to be realised, data collected in primary and secondary care need to be suitably processed to be comparable with those obtained from RCTs. Our methodological advances will potentially allow for research that will enable clinicians and decision makers in health care to revise the existing rules for the provision of a given intervention.
The obvious example is represented by the possibility of revising existing guidelines for prescription of a given drug (eg statins or insulin) according to arbitrarily set thresholds. These are currently based on clinical expert opinion and evidence from RCTs. The RDD could help integrate and complement this information with evidence coming directly from clinical practice and potentially help optimise the point in the disease progression where drugs are administered to patients.
It is also possible to envisage a future where, following the approval of drugs as safe and effective for general prescription through RCT evidence, a second stage of evaluations based in primary care using the RDD is commissioned to determine optimal guidelines for prescription. While we do not attempt to tackle this within the current proposal, we believe that this research can be relatively easily extended to cover cost-effectiveness evaluations of a given health intervention. NICE and the NHS in general would be the obvious beneficiaries of this.
From a more substantive point of view, our proposal has practical implications for highly clinically relevant areas: statins are among the most prescribed drugs in the UK, accounting for over £560 million of NHS expenditure in 2010; moreover, the evidence base is well established, which means that we will be able to compare our results with a reasonably robust body of theory. On the other hand, there is still substantial uncertainty as to whether insulin is the most effective and cost-effective treatment for type II diabetes, especially with respect to the point at which treatment should be initiated.
We will disseminate the work through press offices in our institutions, together with the MRC press office to reach a wider audience.

Funded Value:

£316,786

Funded Period:

Sep 13 - Sep 16

Funder:

MRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

MR/K014838/1

Principal Investigator:

Gianluca Baio

Health Category:

Unclassified

People	ORCID iD
Gianluca Baio (Principal Investigator)
Alexander Dawid (Co-Investigator)
Sara Gisella Geneletti Inchauste (Co-Investigator)
Irwin Nazareth (Co-Investigator)
Sylvia Richardson (Co-Investigator)
Linda Sharples (Co-Investigator)
Nick Freemantle (Co-Investigator)
Irene Petersen (Co-Investigator)
Richard Morris (Co-Investigator)

Publications

Adeleke M (2022) Regression Discontinuity Designs for Time-to-Event Outcomes: An Approach using Accelerated Failure Time Models. in Journal of the Royal Statistical Society Series A: Statistics in Society

Constantinou P (2015) Regression discontinuity designs: A decision-theoretic approach

Geneletti S (2015) Bayesian regression discontinuity designs: incorporating clinical knowledge in the causal analysis of primary care data. in Statistics in medicine

Geneletti S (2019) Bayesian Modelling for Binary Outcomes in the Regression Discontinuity Design. in Journal of the Royal Statistical Society Series A: Statistics in Society

O'Keeffe A (2014) Regression discontinuity designs: an approach to the evaluation of treatment efficacy in primary care using observational data. in BMJ

O'Keeffe AG (2016) Time trends in the prescription of statins for the primary prevention of cardiovascular disease in the United Kingdom: a cohort study using The Health Improvement Network primary care data. in Clinical epidemiology

O'Keeffe AG (2016) Approaches to the Estimation of the Local Average Treatment Effect in a Regression Discontinuity Design. in Scandinavian journal of statistics, theory and applications

O'Keeffe AG (2015) Initiation rates of statin therapy for the primary prevention of cardiovascular disease: an assessment of differences between countries of the UK and between regions within England. in BMJ open

Petersen I (2020) Impact of Being Eligible for Type 2 Diabetes Treatment on All-Cause Mortality and Cardiovascular Events: Regression Discontinuity Design Study. in Clinical epidemiology

Ricciardi F (2023) Dirichlet process mixture models for regression discontinuity designs. in Statistical methods in medical research

Further Funding
Research Databases and Models
Collaboration
Software and Technical Products
Engagement Activities


Description	Evaluating Policy Implementations TO Predict MEntal health [EPITOME]: a Bayesian hierarchical framework for quasi-experimental designs in longitudinal settings
Amount	£663,703 (GBP)
Funding ID	222499/Z/21/Z
Organisation	Wellcome Trust
Sector	Charity/Non Profit
Country	United Kingdom
Start	10/2021
End	09/2025


Title	Danish data
Description	We are liaising with Danish colleagues to include their extensive population registry as part of our work. This is extremely interesting as it will allow us to use a comprehensive database and to make international comparisons with our own THIN dataset (covering England and Wales).
Type Of Material	Database/Collection of data
Provided To Others?	No
Impact	We will be able to draw international comparisons as well as produce a more precise assessment/modelling.


Description	Collaboration with Danish researchers
Organisation	Aarhus University
Country	Denmark
Sector	Academic/University
PI Contribution	We will collaborate on developing Regression Discontinuity Designs (both under a Bayesian and a frequentist approach) using data specific to Denmark. We will be involved in the design of observational studies based on registry data, in which suitable methods (developed in our project) will be applied to perform causal inference.
Collaborator Contribution	We have initiated a collaboration with the Klinisk Epidemiologisk Afdeling in Aarhus. We will be able to use data from Danish registries as part of our case study on diabetes. This will complement the data from the UK THIN dataset.
Impact	We are currently working on data extraction and planning of the analysis.
Start Year	2015


Description	Collaboration with the University of Bristol
Organisation	University of Bristol
Country	United Kingdom
Sector	Academic/University
PI Contribution	We have worked in collaboration with researchers at the University of Bristol to join up our research on the regression discontinuity design and formal development in statistical methodology in the area of causal inference from a decision-theoretic perspective. We have introduced our colleagues to the specific concepts and details of the RDD and have collaborated on a research paper.
Collaborator Contribution	Our colleagues have participated in meetings and worked on a research paper, bringing their expertise in statistical theoretical aspects of causal inference from a decision theoretic point of view.
Impact	There is a working paper, which has also been submitted for publication to the Journal of the Royal Statistical Society.
Start Year	2015


Title	R package for RDD analysis
Description	We are producing an R package to include our main methods and analytic processes for the analysis of regression discontinuity designs, particularly under a Bayesian approach.
Type Of Technology	Software
Year Produced	2016
Open Source License?	Yes
Impact	The software production is ongoing.


Description	End of grant Workshop
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	This workshop aimed to discuss recent methodological issues and applications of the regression discontinuity design (RDD). The RDD is an econometric quasi-experimental design aimed at estimating the causal effects of a treatment by exploiting naturally occurring or externally imposed treatment rules. In this workshop, we aimed to mix and cross-fertilise the perspectives of economists, statisticians and epidemiologists. The workshop consisted of different sessions mixing speakers broadly coming from an economics background with speakers broadly coming from a statistics one. The workshop was jointly organised by the UCL Department of Statistical Science and CEMMAP. It also served as an end-of-project event for an MRC-funded research project (MR/K014838/1) aimed at exploring the application of the RDD to evaluating the effects of drugs and treatments in primary care.
Year(s) Of Engagement Activity	2017
URL	https://www.ucl.ac.uk/statistics/research/statistics-health-economics/workshop_rdd