Phenotyping Patient Trajectories in Complex Disease

Lead Research Organisation: University of Oxford

Abstract

Clinical systems such as Early Warning Scores (EWS) help hospitals determine how ill a patient is. Nevertheless, these systems lag behind more complex models in predicting the risk of adverse outcomes such as exacerbations, ICU admission and death [1]. Furthermore, they cannot currently predict patient trajectories or provide personalised monitoring recommendations for medical intervention (such as the next laboratory test, drug prescription, procedure or treatment), tools that can help clinicians and hospitals improve inpatient health. To bridge this gap, we aim to develop clustering methodologies for multivariate, multi-modal patient vital-sign observations and Electronic Health Records (EHR) information (including laboratory test results, medications, blood tests, cohort and demographic data) that are capable of predicting patient trajectories and, ultimately, of delivering personalised monitoring recommendations to improve patient well-being. Given the heterogeneity of the data, our models also learn a representative trajectory for each detected subtype. These trajectories provide a common framework for comparing distinct clusters and a visually interpretable representation of how patients in each cluster are expected to evolve. Furthermore, as in [2], a 'nearest-representative' approach is taken to allocate new patients to clusters and to make predictions. Additionally, we look to incorporate outcome information to associate clusters and trajectories with the overall patient outcome.
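
As an illustration of the 'nearest-representative' idea, the sketch below assigns a new patient to the cluster whose representative trajectory is closest. It is a minimal sketch under stated assumptions, not the method of [2]: the Euclidean distance, the equal-length trajectories, the two features, and the cluster names and values are all hypothetical choices made for the example.

```python
import math
from typing import Dict, Sequence

# A trajectory is a sequence of time steps, each holding one value per feature.
Trajectory = Sequence[Sequence[float]]


def trajectory_distance(a: Trajectory, b: Trajectory) -> float:
    """Euclidean distance between two equal-length multivariate trajectories.

    Assumption: both trajectories are sampled at the same time points.
    """
    if len(a) != len(b):
        raise ValueError("trajectories must have the same number of time steps")
    squared = sum(
        (x - y) ** 2
        for step_a, step_b in zip(a, b)
        for x, y in zip(step_a, step_b)
    )
    return math.sqrt(squared)


def assign_to_nearest_representative(
    patient: Trajectory, representatives: Dict[str, Trajectory]
) -> str:
    """Return the name of the cluster whose representative is closest."""
    return min(
        representatives,
        key=lambda name: trajectory_distance(patient, representatives[name]),
    )


# Hypothetical representatives: each time step is (heart rate, oxygen saturation).
representatives = {
    "stable": [[80.0, 97.0], [80.0, 97.0], [80.0, 97.0]],
    "deteriorating": [[80.0, 97.0], [95.0, 93.0], [110.0, 89.0]],
}
new_patient = [[82.0, 96.0], [97.0, 92.0], [108.0, 90.0]]

print(assign_to_nearest_representative(new_patient, representatives))
# prints "deteriorating"
```

In practice the features would need scaling before distances are compared, and a distance that tolerates irregular sampling may be more appropriate for clinical data.
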
Finally, we extend our methods to learn optimal recommendations for individual patients using an approach similar to differential diagnosis: when considering a new patient, our models recommend the intervention that previously brought the greatest improvement to patients of a similar subtype.
These methods will be trained and tested on the HAVEN database (REC reference 16/SC/0264 and Confidential Advisory Group reference 08/02/1394), which holds vital-sign, demographic, outcome, medication, laboratory, diagnosis-code and comorbidity information on patient journeys during hospital stays, covering more than 200,000 patients and 4,000,000 observations. HAVEN contains admissions data from four hospitals across Oxford, which will allow models to learn from different hospital environments and thus generalize better. We will also test our methods on public datasets such as MIMIC [3] and e-ICU [4].
Our current work focuses on vital-sign features, together with cohort and outcome information. Additional information on drug prescriptions, blood tests ('bloods') and laboratory tests can improve both prediction performance and the interpretability of the derived clusters, as shown in [5] and [6]. We propose to incorporate these features into our framework, building on their availability in the HAVEN dataset. To do so, we will develop models that are suited to the different types of input feature and capable of detecting changes in patient trajectories caused by medical intervention; drug administration, for example, naturally affects patient vital signs. Similarly, blood test results can be used to infer latent issues or further subtypes in the patient population. Additionally, we look to detect and incorporate other event types in our pipeline.
Knowledge of further outcomes will increase the number of subtypes we are able to learn, yield more clearly separated phenotypes of representative trajectories, and improve the prediction performance of our models. In our work on Chronic Obstructive Pulmonary Disease (COPD), patients with type 2 respiratory failure (T2RF) represent a small but distinct subtype of COPD patients. Ideally, this subtype will be detected from its response to standard oxygen supply treatments.
Our research methodologies fit well within the "Statistics and Applied Probability" theme of the "Mathematical Sciences" research area.
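
The recommendation step described in the abstract (suggesting the intervention that previously helped patients of a similar subtype most) can be sketched as follows. This is only an illustrative sketch: the record structure, the scalar "improvement" score, the use of the mean as the ranking criterion, and the subtype and intervention names are all assumptions introduced here, not details given in the project description.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, NamedTuple, Optional


class InterventionRecord(NamedTuple):
    """One past intervention (hypothetical structure for illustration)."""
    subtype: str        # cluster or subtype the patient was assigned to
    intervention: str   # label of the intervention that was given
    improvement: float  # assumed outcome score; larger means better


def recommend_intervention(
    subtype: str, history: List[InterventionRecord]
) -> Optional[str]:
    """Return the intervention with the highest mean improvement in a subtype.

    Returns None when no past record exists for that subtype.
    """
    by_intervention: Dict[str, List[float]] = defaultdict(list)
    for record in history:
        if record.subtype == subtype:
            by_intervention[record.intervention].append(record.improvement)
    if not by_intervention:
        return None
    return max(by_intervention, key=lambda name: mean(by_intervention[name]))


# Hypothetical history with placeholder subtype and intervention labels.
history = [
    InterventionRecord("subtype_a", "intervention_x", 0.5),
    InterventionRecord("subtype_a", "intervention_x", 0.7),
    InterventionRecord("subtype_a", "intervention_y", 0.9),
    InterventionRecord("subtype_a", "intervention_y", 0.1),
    InterventionRecord("subtype_b", "intervention_y", 0.8),
]

print(recommend_intervention("subtype_a", history))  # prints "intervention_x"
```

Ranking by a simple mean ignores confounding and sample size; a real system would need to account for both before any clinical use.
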

Planned Impact

In the same way that bioinformatics has transformed genomic research and clinical practice, health data science will have a dramatic and lasting impact upon the broader fields of medical research, population health, and healthcare delivery. The beneficiaries of the proposed training programme, and of the research that it delivers and enables, will include academia, industry, healthcare, and the broader UK economy.

Academia: Graduates of the training programme will be well placed to start their post-doctoral careers in leading academic institutions, engaging in high-impact multi-disciplinary research, helping to build training and research capacity, and sharing their experience within the wider academic community.

Industry: Partner organisations will benefit from close collaboration with leading researchers, from the joint exploration of research priorities, and from the commercialisation of arising intellectual property. Other organisations will benefit from the availability of highly-qualified graduates with skills in big health data analytics.

Healthcare: Healthcare organisations and patients will benefit from the results of enabled and accelerated health research, leading to new treatments and technologies, and an improved ability to identify and evaluate potential improvements in practice through the analysis of real-world health data.

Economy: The life sciences sector is a key component of the UK economy. The programme will provide partner companies with direct access to leading-edge research. Graduates of the programme will be well-qualified to contribute to economic growth - supporting health research and the development of new products and services - and will be able to inform policy and decision making at organisational, regional, and national levels.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S02428X/1       -             -             01/04/2019  30/09/2027  -
2271697            Studentship   EP/S02428X/1  01/10/2019  19/04/2024  Henrique Aguiar