Methodology to support the development of prognostic models that incorporate and inform the observation processes within electronic health records

Lead Research Organisation: University of Manchester
Department Name: School of Health Sciences


There is substantial interest in using the abundance of data available through patients' electronic health records (EHRs) to develop clinical prediction models (CPMs) that predict whether a person will have an event at some future time point based on what we know about them now [1,2]. CPMs can support both evidence-based decision-making and the recent drive for precision medicine, but challenges remain when modelling within EHRs. Specifically, EHRs usually contain longitudinal information about a patient, through repeated contact with health services, but the observation times and frequency of measurement will vary considerably within and across patients.

For example, suppose a CPM uses lab results and blood pressure to predict the real-time risk of a patient being transferred to an intensive care unit (ICU), thereby facilitating early warning during regular hospital visits. While a patient's EHR might contain regular blood pressure measurements, their lab results might be observed less frequently. Rather than trying to impute missing risk factors (e.g. lab results) at a given time point, this project will develop methods that allow the CPM to make real-time predictions in their absence (e.g. predict using only blood pressure), and then potentially 'request' additional information for certain patients. Such 'interactive measurement' would facilitate targeted high-frequency data collection in those at high risk of an adverse outcome (e.g. deterioration towards ICU transfer).
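The idea of predicting without the missing risk factor and then 'requesting' it can be illustrated with a minimal sketch. It assumes one simple possibility, a separate logistic submodel for each pattern of available risk factors, which is not necessarily the method this project will develop. The feature names (`sbp`, `lactate`), all coefficients, and the request threshold are hypothetical and invented for illustration only; none are fitted to data.

```python
import math

# Hypothetical coefficients, invented for illustration (not fitted to any data).
# One logistic submodel per pattern of available risk factors.
SUBMODELS = {
    frozenset({"sbp"}): {"intercept": 4.0, "sbp": -0.05},
    frozenset({"sbp", "lactate"}): {"intercept": 2.0, "sbp": -0.05, "lactate": 0.8},
}
REQUEST_THRESHOLD = 0.2  # hypothetical risk level above which labs are requested


def predict_risk(observed):
    """Return (risk, features_used) from the submodel matching what was observed."""
    available = frozenset(k for k, v in observed.items() if v is not None)
    if available not in SUBMODELS:
        raise ValueError(f"no submodel for features: {sorted(available)}")
    coefs = SUBMODELS[available]
    linear = coefs["intercept"] + sum(coefs[k] * observed[k] for k in available)
    return 1.0 / (1.0 + math.exp(-linear)), available


def should_request_labs(observed):
    """Flag a patient for a lab request when labs are missing and risk is high."""
    risk, used = predict_risk(observed)
    return "lactate" not in used and risk >= REQUEST_THRESHOLD


# Labs are missing, so the blood-pressure-only submodel is used.
print(should_request_labs({"sbp": 90, "lactate": None}))
print(should_request_labs({"sbp": 130, "lactate": None}))
```

In this sketch the prediction never waits for, or imputes, the lab value: the patient with low blood pressure is flagged for a lab request, while the patient with normal blood pressure is not.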

Likewise, if the additional information (e.g. lab results) were available a priori, then its availability would itself reflect the clinician's beliefs about a patient, since the observation frequency contains information on the patient's underlying health status: so-called informative observation [3]. Jointly modelling the observation process and the outcome process is rarely considered in CPM development, but could allow CPMs to learn from clinical judgements [4,5].

Therefore, this project will: (i) explore the current approaches to incorporating observation processes within CPM development, (ii) develop methods that allow CPMs to handle heterogeneous risk factor measurement frequency, while concurrently informing interactive measurements, and (iii) study the relationships between interactive measurement and informative observation.

For clinical examples, we intend to focus on: 1) discharge versus care escalation in a critical care (hospital) context, and 2) prediction of disease incidence in primary care.

