The Regression Discontinuity Design: a novel approach to evaluating the effect of drugs and treatments in primary care

Lead Research Organisation: University College London
Department Name: Statistical Science


A fundamental task in clinical practice is to determine whether a particular drug is being prescribed in the most effective way, in order to increase the clinical benefit derived from its use. While Randomised Clinical Trials (RCTs) are correctly considered to be the best scientific method for evaluating drug efficacy, these studies often have poor external validity because of patient selection, in particular the exclusion of patients with comorbidities. Prescription guidelines are not always evidence based and it typically falls to clinical experts to set them.
The regression discontinuity design (RDD) is an econometric quasi-experimental design aimed at estimating the causal effects of a treatment by exploiting naturally occurring treatment rules. It was first introduced in the educational economics literature in the 1960s but it has not been widely used outside of this field until recently. The RDD exploits the fact that many treatments are assigned according to pre-decided rules, eg those set by NICE in the UK.
The idea behind the RDD is that if we can assume that individuals just on either side of a pre-selected threshold (eg blood pressure of 140/90mmHg) belong to a common population with respect to the characteristics that inform the assignment rule and determine the outcome, then the threshold can be seen as a random intervention which assigns the treatment to those who fall just above it and no treatment to those who fall just below it. Because of the quasi-randomised nature of the RDD, confounding is eliminated or at least mitigated. We can thus use the RDD to estimate the effect of the treatment or a related exposure within the population around the threshold. However, although the RDD mimics randomisation, in our particular case the analyses will still need to account for other factors that may influence the decision to prescribe, including markers of patients' frailty or comorbidities, in order to reduce bias in the estimates.
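As an illustration only, the quantity targeted when the rule is followed exactly can be written as a jump in the expected outcome at the threshold. The notation is an assumption introduced here and does not appear in the proposal: X is the assignment variable (eg a blood pressure measurement), x_0 the threshold and Y the outcome.

```latex
\tau = \lim_{x \downarrow x_0} \mathbb{E}[\,Y \mid X = x\,] \;-\; \lim_{x \uparrow x_0} \mathbb{E}[\,Y \mid X = x\,]
```

If the expected outcome would otherwise vary smoothly in X around x_0, any jump at x_0 can be attributed to the treatment, which is why the estimate applies only to the population close to the threshold.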
In situations where the set-up is fully adhered to (ie when all individuals above the threshold are given the treatment and all those below it are not), the RDD is termed "sharp". Conversely, in cases where well-defined guidelines exist but, for some reason (possibly completely legitimate from a clinical perspective), they are not fully followed, the RDD is termed "fuzzy".
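A minimal sketch of how a fuzzy design is commonly handled, assuming a simple Wald-type ratio: the jump in the mean outcome at the threshold divided by the jump in the proportion treated. The function name, the simulated data and the adherence rates (80% treated above the threshold, 10% below) are hypothetical and chosen only for illustration; this is not the project's estimator. In a sharp design the denominator equals 1.

```python
import numpy as np

def fuzzy_rdd_wald(x, t, y, threshold, bandwidth):
    """Wald-type fuzzy RDD estimate using raw means in a window around the threshold."""
    x, t, y = (np.asarray(a) for a in (x, t, y))
    above = (x >= threshold) & (x < threshold + bandwidth)
    below = (x < threshold) & (x >= threshold - bandwidth)
    jump_outcome = y[above].mean() - y[below].mean()      # change in mean outcome
    jump_treatment = t[above].mean() - t[below].mean()    # change in proportion treated
    return jump_outcome / jump_treatment

# Simulated data (hypothetical): assignment variable centred so the threshold is 0.
rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(-1.0, 1.0, n)
# Imperfect adherence to the guideline: 80% treated above, 10% treated below.
t = rng.random(n) < np.where(x >= 0.0, 0.8, 0.1)
# Continuous outcome with a true treatment effect of 2.0.
y = 1.0 + 0.5 * x + 2.0 * t + rng.normal(0.0, 1.0, n)

estimate = fuzzy_rdd_wald(x, t, y, threshold=0.0, bandwidth=0.1)
print(f"fuzzy RDD estimate: {estimate:.2f}")
```

Because raw window means ignore the slope of the outcome in the assignment variable, the estimate carries a small bias that shrinks with the bandwidth; regression on each side of the threshold is the usual remedy.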
RDD estimators have strong connections with other causal inference estimators in statistical research. However, in our case, we are faced with the complication that in epidemiology and clinical research the outcome of interest is often a binary variable (eg the occurrence of a clinical event, or mortality). Specific methods for such a case have not been extensively developed, although Instrumental Variable (IV) theory can be brought to bear. The main difficulty is that in this case the interest is in causal ratios, rather than causal differences (which are relevant in the case of continuous outcomes). Methods that can be used to derive such estimators rely on assumptions beyond those of the standard IV setting: A1) approximate linearity between a function of the outcome and the treatment and unobserved confounders; and A2) no interaction between the treatment and the unobserved confounders (termed "no-effect modification on the multiplicative scale"). These cannot be directly tested (because they involve unobservable variables), but we will investigate their plausibility using simulations.
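To make the distinction concrete, the two kinds of estimand can be written in potential-outcome notation, together with one possible reading of A1 and A2. All symbols are assumptions introduced here: Y(1) and Y(0) are the outcomes with and without treatment, T is the treatment, U the unobserved confounders, and the log link and the function h are illustrative choices, since the proposal refers only to "a function of the outcome".

```latex
\begin{align*}
\text{causal difference:} \quad & \mathbb{E}[Y(1)] - \mathbb{E}[Y(0)] \\
\text{causal risk ratio:} \quad & \frac{\Pr\{Y(1) = 1\}}{\Pr\{Y(0) = 1\}} \\
\text{one reading of A1 and A2:} \quad & \log \Pr(Y = 1 \mid T = t,\, U = u) = \alpha + \beta t + h(u)
\end{align*}
```

Under this illustrative form, and if the model is interpreted causally, the absence of a product term between t and u means that the treatment multiplies the risk by the same factor exp(β) at every level of U, so exp(β) is also the population risk ratio.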
Electronic healthcare records such as "The Health Improvement Network" (THIN) primary care database provide an excellent opportunity to investigate questions like those mentioned above. From THIN we can obtain detailed information on prescriptions made in primary care and measurements both before and after initiation of prescription, as well as adverse events.

Technical Summary

The regression discontinuity design (RDD) is an econometric quasi-experimental design aimed at estimating the causal effects of a treatment by exploiting treatment rules. It can be applied in any context where a particular treatment is administered according to a pre-defined rule linked to a continuous variable. Such thresholds exist in primary care in the context of drug prescription and we focus on the examples of statin and insulin prescription. The idea is that if we can assume that individuals on either side of a threshold belong to a common population with respect to the characteristics informing the assignment rule and determining the outcome, then the threshold can be seen as a random intervention assigning the treatment to those falling just above the threshold and no treatment to those just below it. Regressions are typically used to predict the value of the outcome on either side of the threshold. We will use regression methods developed in econometrics, such as local linear regression, as well as more sophisticated regression models if the relationships we observe are non-linear. One methodological issue is that in most clinical cases the threshold is unlikely to be strictly adhered to. This is due to two levels of non-compliance: 1) whether GPs follow the prescription guidelines (GP decision making) and 2) whether patients who are prescribed a drug take it (patient prescription adherence). Since the RDD has several mathematical similarities with Instrumental Variables (IVs), we will draw on the econometric literature to produce our estimates of the treatment effects. However, since we will be concerned with binary outcomes, it will be necessary to extend standard IV methods. In particular, we will use a Bayesian approach; this will be beneficial because weak instruments, which arise in the RDD when the guidelines are loosely adhered to, are best handled within a Bayesian framework.
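The local linear regression step can be sketched as follows, assuming a sharp design, a continuous outcome and a triangular kernel. The function name, bandwidth and simulated data are hypothetical; this is a frequentist illustration of the general technique and not the Bayesian models proposed for the project.

```python
import numpy as np

def local_linear_rdd(x, y, threshold, bandwidth):
    """Sharp RDD estimate: difference of kernel-weighted linear fits at the threshold."""
    x = np.asarray(x, dtype=float) - threshold   # centre so the threshold sits at 0
    y = np.asarray(y, dtype=float)
    # Triangular kernel: weight falls linearly to 0 at distance `bandwidth`.
    weights = np.clip(1.0 - np.abs(x) / bandwidth, 0.0, None)

    def boundary_fit(side):
        keep = side & (weights > 0)
        design = np.column_stack([np.ones(keep.sum()), x[keep]])
        root_w = np.sqrt(weights[keep])
        # Weighted least squares via rescaling rows by the square root of the weights.
        coef, *_ = np.linalg.lstsq(design * root_w[:, None], y[keep] * root_w, rcond=None)
        return coef[0]   # intercept = predicted outcome at the threshold

    return boundary_fit(x >= 0) - boundary_fit(x < 0)

# Simulated data (hypothetical): treatment given exactly when the score is >= 0.
rng = np.random.default_rng(1)
n = 20_000
score = rng.uniform(-1.0, 1.0, n)
treated = score >= 0.0
outcome = 1.0 + 0.8 * score + 1.5 * treated + rng.normal(0.0, 0.5, n)   # true jump 1.5

estimate = local_linear_rdd(score, outcome, threshold=0.0, bandwidth=0.25)
print(f"local linear RDD estimate: {estimate:.2f}")
```

Fitting a separate line on each side lets the outcome trend differ above and below the threshold, and comparing the two intercepts removes the slope bias that a simple comparison of means would carry.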

Planned Impact

Our proposal originates from the consideration that, while RCTs are a fundamental tool to evaluate health interventions, their limitations in terms of external validity mean that there is an increasing need to bridge from trials to the real world, drawing also on data from observational settings. However, for this to be realised, data collected in primary and secondary care need to be suitably processed to be comparable with those obtained by RCTs. Our methodological advances will potentially allow for research that will enable clinicians and decision makers in health care to revise the existing rules for the provision of a given intervention.
The obvious example is represented by the possibility of revising existing guidelines for prescription of a given drug (eg statins or insulin) according to arbitrarily set thresholds. These are currently based on clinical expert opinion and evidence from RCTs. The RDD could help integrate and complement this information with evidence coming directly from clinical practice and potentially help optimise the point in the disease progression where drugs are administered to patients.
It is also possible to envisage a future where, following the approval of drugs as safe and effective for general prescription through RCT evidence, a second stage of evaluations based in primary care using the RDD is commissioned to determine optimal guidelines for prescription. While we do not attempt to tackle this within the current proposal, we believe that this research can be relatively easily extended to cover cost-effectiveness evaluations of a given health intervention. NICE and the NHS in general would be the obvious beneficiaries of this.
From a more substantive point of view, the practical implications of our proposal bear on extremely clinically relevant areas: statins are among the most prescribed drugs in the UK, accounting for over £560 million of NHS expenditure in 2010; moreover, the evidence base is well established, which means that we will be able to contrast our results with a reasonably robust theory. On the other hand, there is still substantial uncertainty as to whether insulin is the most effective and cost-effective option for the treatment of type II diabetes, especially with respect to the point at which treatment should be initiated.
We will disseminate the work through press offices in our institutions, together with the MRC press office to reach a wider audience.

Title: Danish data
Description: We are liaising with Danish colleagues to include their extensive population registry as part of our work. This is extremely interesting as it will allow us to use a comprehensive database and to draw international comparisons with our own THIN dataset (covering England and Wales).
Type Of Material: Database/Collection of data
Provided To Others?: No
Impact: We will be able to draw international comparisons and to produce a more precise assessment/modelling.

Description: Collaboration with Danish researchers
Organisation: Aarhus University
Country: Denmark
Sector: Academic/University
PI Contribution: We will collaborate on developing Regression Discontinuity Designs (both under a Bayesian and a frequentist approach) using data specific to Denmark. We will be involved in the design of observational studies based on registry data, in which suitable methods (developed in our project) will be applied to perform causal inference.
Collaborator Contribution: We have initiated a collaboration with the Klinisk Epidemiologisk Afdeling in Aarhus. We will be able to use data from Danish registries as part of our case study on diabetes. This will complement the data from the UK THIN dataset.
Impact: We are currently working on data extraction and planning of the analysis.
Start Year: 2015

Description: Collaboration with the University of Bristol
Organisation: University of Bristol
Country: United Kingdom
Sector: Academic/University
PI Contribution: We have worked in collaboration with researchers at the University of Bristol to join up our research on the regression discontinuity design with formal developments in statistical methodology in the area of causal inference from a decision-theoretic perspective. We have introduced our colleagues to the specific concepts and details of the RDD and have collaborated on a research paper.
Collaborator Contribution: Our colleagues have participated in meetings and worked on a research paper, bringing their expertise in the theoretical statistical aspects of causal inference from a decision-theoretic point of view.
Impact: There is a working paper, which has also been submitted for publication to the Journal of the Royal Statistical Society.
Start Year: 2015

Title: R package for RDD analysis
Description: We are producing an R package implementing our main methods and analytic processes for regression discontinuity analysis, particularly under a Bayesian approach.
Type Of Technology: Software
Year Produced: 2016
Open Source License?: Yes
Impact: The software production is ongoing.

Description: End of grant Workshop
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme?: No
Geographic Reach: International
Primary Audience: Postgraduate students
Results and Impact: This workshop discussed recent methodological issues and applications of the regression discontinuity design (RDD). The RDD is an econometric quasi-experimental design aimed at estimating the causal effects of a treatment by exploiting naturally occurring or externally imposed treatment rules. The workshop aimed to bring together and cross-fertilise the perspectives of Economists, Statisticians and Epidemiologists. It consisted of several sessions mixing speakers broadly from an Economics background with speakers broadly from a Statistics background. The workshop was jointly organised by the UCL Department of Statistical Science and CEMMAP. It also served as an end-of-project event for an MRC-funded research project (MR/K01438/1) aimed at exploring the application of the RDD to evaluating the effects of drugs and treatments in primary care.
Year(s) Of Engagement Activity: 2017