Identifying subtypes of Alzheimer's disease in Electronic Health Records

Lead Research Organisation: University College London

Department Name: Institute of Health Informatics

Abstract

Patients with Alzheimer's disease (AD) display large variation in symptom presentation, rate of progression and comorbidities. As the number of disease factors involved increases, it becomes harder to ascertain the specific role each plays in affecting the disease and how it should be treated, especially when the interactions among all the other disease factors must be considered. Examining hidden patterns in these disease factors can help untangle their relationship with AD progression and treatment. One type of pattern that can be identified is distinct clusters of patients with similar patterns of these disease factors. Representing AD heterogeneity in this way offers the opportunity for unique insights about the disease.

This research uses several different cluster analysis methods to examine and validate subtypes of AD using electronic health records (EHR). Using EHR means that a variety of clinical attributes about the patient can be used in the analysis. The outcomes can be lifted directly from the records and are therefore relatable to patients' experience in the NHS. This research will first find AD subtypes based on symptoms; it will then expand to include a variety of comorbidities and examine how the subtypes differ in rate of progression and other factors.

Oct 17 - Dec 21

Funder:

MRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

1940103

Health Category:

Unclassified

Organisations

University College London (Lead Research Organisation)

People	ORCID iD
Spiros Denaxas (Primary Supervisor)

Publications



Williamson E (2019) Risk of mortality and cardiovascular events following macrolide prescription in chronic rhinosinusitis patients: a cohort study using linked primary care electronic health records. in Rhinology

Uijl A (2019) Risk factors for incident heart failure in age- and sex-specific strata: a population-based cohort using linked electronic health records. in European journal of heart failure

Shah S (2020) Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure. in Nature communications

Shah S (2019) Genome-wide association study provides new insights into the genetic architecture and pathogenesis of heart failure

Shah AD (2019) Natural language processing for disease phenotyping in UK primary care records for research: a pilot study in myocardial infarction and death. in Journal of biomedical semantics

Schmidt AF (2019) Phenome-wide association analysis of LDL-cholesterol lowering genetic variants in PCSK9. in BMC cardiovascular disorders

Rafiq M (2019) Socioeconomic deprivation and regional variation in Hodgkin's lymphoma incidence in the UK: a population-based cohort study of 10 million individuals. in BMJ open

Rafiq M (2020) Allergic disease, corticosteroid use, and risk of Hodgkin lymphoma: A United Kingdom nationwide case-control study. in The Journal of allergy and clinical immunology

Pikoula M (2019) Identifying clinically important COPD sub-types using data-driven approaches in primary care population based electronic health records. in BMC medical informatics and decision making

Philpott C (2019) Clarithromycin and endoscopic sinus surgery for adults with chronic rhinosinusitis with and without nasal polyps: study protocol for the MACRO randomised controlled trial. in Trials

Pathak N (2019) Validity of using UK primary care electronic health records to study migration and health: a population-based cohort study. in Lancet

Pasea L (2019) Bleeding in cardiac patients prescribed antithrombotic drugs: electronic health record phenotyping algorithms, incidence, trends and prognosis. in BMC medicine

Pasea L (2019) Bleeding in cardiac patients prescribed antithrombotic drugs: Electronic health record phenotyping algorithms, incidence, trends and prognosis

McMahon C (2019) A novel metadata management model to capture consent for record linkage in longitudinal research studies. in Informatics for health & social care

Hopkins C (2019) Antibiotic usage in chronic rhinosinusitis: analysis of national primary care electronic health records. in Rhinology

Hingorani AD (2019) Improving the odds of drug development success through human genomics: modelling study. in Scientific reports

Hingorani A (2017) Flipping the odds of drug development success through human genomics

Henry A (2019) The relationship between sleep duration, cognition and dementia: a Mendelian randomization study. in International journal of epidemiology

Hemingway H (2017) Using nationwide 'big data' from linked electronic health records to help improve outcomes in cardiovascular diseases: 33 studies using methods from epidemiology, informatics, economics and social science in the ClinicAl disease research using LInked Bespoke studies and Electronic health Records (CALIBER) programme in Programme Grants for Applied Research

Farmer RE (2019) Associations Between Measures of Sarcopenic Obesity and Risk of Cardiovascular Disease and Mortality: A Cohort Study and Mendelian Randomization Analysis Using the UK Biobank. in Journal of the American Heart Association

Dickerman BA (2019) Avoidable flaws in observational analyses: an application to statins and cancer. in Nature medicine

Denaxas S (2019) Analyzing the heterogeneity of rule-based EHR phenotyping algorithms in CALIBER and the UK Biobank

Denaxas S (2019) Phenotyping UK Electronic Health Records from 15 Million Individuals for Precision Medicine: The CALIBER Resource. in Studies in health technology and informatics

Denaxas S (2019) UK phenomics platform for developing and validating EHR phenotypes: CALIBER

Denaxas S (2019) UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER. in Journal of the American Medical Informatics Association : JAMIA

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
MR/R502248/1			01/10/2017	30/09/2021
1940103	Studentship	MR/R502248/1	01/10/2017	30/12/2021

Further Funding
Research Tools and Methods
Engagement Activities


Description	Defining and redefining human disease at scale - the human phenome project (GSK)
Amount	£851,000 (GBP)
Organisation	GlaxoSmithKline (GSK)
Sector	Private
Country	Global
Start	01/2020
End	01/2021


Title	Monte Carlo Method for Cluster Evaluation for EHR
Description	This is a tool to evaluate the structure of clusters found using unsupervised machine learning methods in electronic health records, based on comparison with a Monte Carlo generated null distribution.
Type Of Material	Improvements to research infrastructure
Year Produced	2020
Provided To Others?	No
Impact	This tool will help improve the validity of patient subtypes found in clustering studies in EHR, thus increasing the likelihood that these subtypes could be clinically useful.
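
The record does not describe how this tool is implemented. As a rough illustration of the general idea only, the sketch below compares a clustering statistic on observed data with the same statistic on Monte Carlo null datasets. Everything in it is an assumption made for illustration: the statistic (within-cluster sum of squares from a basic k-means), the null model (shuffling each feature column independently, which breaks the joint structure between features), the function names, and the continuous toy data, which stands in for real EHR features.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans_wss(points, k, rng, restarts=5, n_iter=20):
    """Best within-cluster sum of squares over several basic k-means runs."""
    best = float("inf")
    for _ in range(restarts):
        centres = rng.sample(points, k)
        for _ in range(n_iter):
            groups = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda c: sq_dist(p, centres[c]))
                groups[nearest].append(p)
            # Keep the old centre if a cluster ends up empty.
            centres = [centroid(g) if g else centres[i]
                       for i, g in enumerate(groups)]
        wss = sum(min(sq_dist(p, c) for c in centres) for p in points)
        best = min(best, wss)
    return best


def permute_columns(points, rng):
    """Null dataset: shuffle each feature independently across patients."""
    columns = [list(col) for col in zip(*points)]
    for col in columns:
        rng.shuffle(col)
    return [tuple(row) for row in zip(*columns)]


def monte_carlo_p_value(points, k, n_sim=50, seed=0):
    """Empirical p-value: share of null datasets that cluster as tightly."""
    rng = random.Random(seed)
    observed = kmeans_wss(points, k, rng)
    null = [kmeans_wss(permute_columns(points, rng), k, rng)
            for _ in range(n_sim)]
    # Add-one correction so the estimate is never exactly zero.
    return (1 + sum(w <= observed for w in null)) / (1 + n_sim)


# Toy data (hypothetical): two well-separated groups of 30 points each.
toy_rng = random.Random(1)
toy = ([(toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)) for _ in range(30)]
       + [(toy_rng.gauss(10, 1), toy_rng.gauss(10, 1)) for _ in range(30)])
p_value = monte_carlo_p_value(toy, k=2)
```

A small p-value suggests the observed clusters are tighter than would be expected if the features were unrelated. Other statistics and null models are possible, and the choice affects what "no structure" means.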


Title	Phenome-wide phenotyping algorithms
Description	Machine-readable versions (CSV files) of electronic health record phenotyping algorithms for Kuan V., Denaxas S., Gonzalez-Izquierdo A. et al. A chronological map of 308 physical and mental health conditions from 4 million individuals in the National Health Service published in the Lancet Digital Health - DOI 10.1016/S2589-7500(19)30012-3
Type Of Material	Improvements to research infrastructure
Year Produced	2019
Provided To Others?	Yes
Impact	Algorithms are being used in two additional projects: 1) Pathak N. et al Migrant EHR and 2) Denaxas S. et al GSK/phenomics
URL	https://github.com/spiros/chronological-map-phenotypes


Title	Synthetic EHR for Cluster Benchmarking
Description	This tool is a wrapper for the synthetic health record generator SYNTHEA which transforms the data into usable and realistic health records that have distinct and modifiable parameters, such as cluster number and cluster separation, as well as realistic patient outcomes that researchers can use to benchmark methods for finding patient subgroups.
Type Of Material	Improvements to research infrastructure
Year Produced	2020
Provided To Others?	No
Impact	This method will help researchers validate future tools for patient subtyping, and allow potential users of those methods to understand in greater detail the benefits of those methods.
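
To make the idea of a benchmark with controllable cluster number and separation concrete, here is a minimal sketch. It does not use SYNTHEA and is not the tool described above; the function name, the binary feature model and the way `separation` is defined are all assumptions chosen for illustration.

```python
import random


def make_synthetic_patients(n_patients, n_clusters, n_features,
                            separation, seed=0):
    """Return (records, labels): binary feature vectors and true cluster ids.

    separation is in [0, 1] (hypothetical definition): 0 gives every
    cluster the same 50% prevalence for every feature, so there is no
    structure; 1 makes each cluster's signature features always present
    and all other features always absent.
    """
    rng = random.Random(seed)
    records, labels = [], []
    for i in range(n_patients):
        cluster = i % n_clusters
        row = []
        for j in range(n_features):
            # Feature j acts as a signature of cluster (j % n_clusters).
            if j % n_clusters == cluster:
                prevalence = 0.5 + 0.5 * separation
            else:
                prevalence = 0.5 - 0.5 * separation
            row.append(1 if rng.random() < prevalence else 0)
        records.append(row)
        labels.append(cluster)
    return records, labels


# Example: 90 patients in 3 clusters over 6 binary features.
records, labels = make_synthetic_patients(90, 3, 6, separation=0.8)
```

Because the true labels are returned alongside the records, a clustering method can be scored against known ground truth at several separation levels.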


Title	tofu
Description	Tofu is a Python library for generating synthetic UK Biobank data. The UK Biobank is a large open-access prospective research cohort study of 500,000 middle-aged participants recruited in England, Scotland and Wales. The study has collected and continues to collect extensive phenotypic and genotypic detail about its participants, including data from questionnaires, physical measures, sample assays, accelerometry, multimodal imaging, genome-wide genotyping and longitudinal follow-up for a wide range of health-related outcomes. Tofu will generate synthetic data which conform to the structure of the baseline data UK Biobank sends researchers by generating random values: For categorical variables (single or multiple choices), a random value will be picked from the UK Biobank data dictionary for that field. For continuous variables, a random value will be generated based on the distribution of values reported for that field on the UK Biobank showcase. For date and date/time fields, a random date will be generated. For all other fields, such as polymorphic fields, no data will be generated. Some general observations: The lookups directory contains lookups downloaded from the UK Biobank showcase - they might need to be updated when new fields become available. Data conform to the structure and schema of the baseline file but are otherwise nonsensical: no checks have been implemented across fields. All eids (patient identifiers) generated by this tool are prefixed with 'fake' in order to avoid confusion with legitimate datasets. Randomly generated dates fall between 1910 and 1990, again to avoid confusion with real data.
Type Of Material	Improvements to research infrastructure
Year Produced	2019
Provided To Others?	Yes
Impact	Data has been used for training purposes at a postgraduate and postdoc level.
URL	https://github.com/spiros/tofu
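
The per-type rules in the description above can be sketched as follows. This is not tofu's code or API: the field names, choices, mean and standard deviation are hypothetical placeholders, the normal distribution is an assumed stand-in for "the distribution of values reported" on the showcase, and treating both 1910 and 1990 as inclusive bounds is also an assumption.

```python
import random
from datetime import date, timedelta

# Hypothetical field definitions. Real field IDs, codings and summary
# statistics would come from UK Biobank showcase lookups, not this dict.
FIELDS = {
    "sex": {"type": "categorical", "choices": ["Female", "Male"]},
    "height_cm": {"type": "continuous", "mean": 168.0, "sd": 9.0},
    "some_date": {"type": "date"},
    "polymorphic_field": {"type": "other"},
}


def random_date(rng, start_year=1910, end_year=1990):
    """Pick a uniformly random date between the two years (inclusive)."""
    start = date(start_year, 1, 1)
    end = date(end_year, 12, 31)
    return start + timedelta(days=rng.randrange((end - start).days + 1))


def synthetic_row(index, fields, rng):
    """Build one synthetic participant; no consistency checks across fields."""
    row = {"eid": f"fake{index}"}  # prefix marks the identifier as synthetic
    for name, spec in fields.items():
        if spec["type"] == "categorical":
            row[name] = rng.choice(spec["choices"])
        elif spec["type"] == "continuous":
            # Assumed normal distribution with hypothetical parameters.
            row[name] = rng.gauss(spec["mean"], spec["sd"])
        elif spec["type"] == "date":
            row[name] = random_date(rng)
        else:
            row[name] = None  # no data generated for other field types
    return row


rng = random.Random(0)
rows = [synthetic_row(i, FIELDS, rng) for i in range(5)]
```

Each row is generated field by field with no dependence between fields, which matches the description's warning that the data are structurally valid but otherwise nonsensical.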


Description	MRC Methodology Research Panel
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Invited guest member on MRC Methodology Research Panel
Year(s) Of Engagement Activity	2018,2019,2020


Description	Organisation of a precision medicine panel discussion in conjunction with HDRUK
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Postgraduate students
Results and Impact	I led the organisation of a panel discussion hosted at the Wellcome Trust in conjunction with HDRUK, where industry professionals, academics and policy makers discussed how to properly harness the potential of precision medicine in the NHS. Aimed at PhD students.
Year(s) Of Engagement Activity	2020


Description	Organised a discussion panel on AI in healthcare in conjunction with HDRUK
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Postgraduate students
Results and Impact	I led the organisation of a panel discussion hosted at the Wellcome Trust, where industry professionals, academics and policy makers discussed what is holding back AI in the NHS. Aimed at PhD students.
Year(s) Of Engagement Activity	2019


Description	UK Science & Innovation Network & NIH Maternal Health & AI Research Symposium
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Invited to speak at the Maternal Health & AI Research Symposium in Boston, MA, USA.
Year(s) Of Engagement Activity	2020


Description	Wellcome Innovations Flagships
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Member of the Innovations Flagships panel. Innovations Flagships support the development of exciting new products, technologies and other interventions to prevent or treat disease.
Year(s) Of Engagement Activity	2019,2020
