Enhancing cardiovascular disease risk prediction through using longitudinal measures of risk factors in clustered data
Lead Research Organisation:
University of Cambridge
Department Name: Public Health and Primary Care
Abstract
WHY IS CARDIOVASCULAR DISEASE IMPORTANT?
Cardiovascular disease (CVD), comprising mostly heart attacks and strokes, is the UK's leading cause of death and disability.
WHAT IS CVD RISK ASSESSMENT?
Measurement of "risk factors" that are linked with greater risk of CVD (eg, older age, male sex, smoking, high blood pressure, low and high body mass index, high cholesterol) can identify people who may especially benefit from preventive action, such as lifestyle advice and/or medications. Various risk prediction models summarise the levels of several risk factors simultaneously and are used to classify patients into high, medium or low risk groups. Most CVD risk prediction models in clinical use are based on single measures of risk factors.
WHAT IS THE HEALTH SERVICE ALREADY DOING?
In 2008, the UK government announced a national initiative to conduct CVD risk assessment in people aged 40-74 years with no history of CVD. The checks are being done in general practices and pharmacies.
WHY IS RESEARCH NEEDED?
Although experts generally agree that CVD risk assessment has the potential to prevent disease, there is uncertainty about:
- whether past measures of risk factors, or changes in risk factors, yield useful information in addition to current measures;
- how past and current measures of risk factors can be incorporated into a statistical model;
- how information on risk, when incorporating past and current measures of risk factors, should best be summarised.
WHAT IS THE PROPOSED RESEARCH?
An interdisciplinary team of internationally recognised statisticians and epidemiologists will:
1. analyse detailed scientific databases with information on CVD risk factors measured more than once in patients with and without heart attacks or strokes, in a total of about 4 million participants, thereby reliably determining the value of repeated measures of risk factors for CVD risk assessment;
2. use these databases to identify which CVD risk factors provide useful extra information when measured on more than one occasion for CVD risk assessment;
3. determine suitable statistical models that combine information from past and current measures of risk factors in order to provide up-to-date relevant CVD risk assessments for patients;
4. investigate the use of interactive technology to provide graphical displays of individual CVD risk over time.
WHAT DATA SOURCES ARE AVAILABLE FOR THIS RESEARCH?
We have two large databases for our investigations:
Emerging Risk Factors Collaboration
The Emerging Risk Factors Collaboration has collated individual data on up to 500 characteristics in over 1.8 million participants in over 120 long-term studies. Within these data, approximately 280,000 participants have at least two measurements of the main CVD risk factors.
The Health Improvement Network
The Health Improvement Network (THIN) is a primary care database of anonymised clinical records entered by general practices on their computer system for patient management (ViSion). THIN now includes records from around 4 million people with repeat measures of risk factors from over 450 practices across the UK. Patients are included from when they register with a GP and contribute data until they die or leave the practice.
These two data sources, which combine observational studies and electronic health records, have the power and generalisability to help clarify the added value of using repeat measures of risk factors for CVD risk prediction.
HOW WILL HEALTH USERS AND THE PUBLIC BENEFIT?
Findings from this research will help to inform and optimise current (and future) government initiatives in CVD risk assessment. This should lead to the use of more accurate risk assessment methods. Although the proposed work will focus on predictors of CVD, the methodology and strategies will also be widely applicable to identifying repeat measures of risk predictors, and assessing their added value, in other diseases.
Technical Summary
CURRENT CARDIOVASCULAR DISEASE (CVD) RISK MODELS
Most CVD risk models in clinical use have been constructed from Proportional Hazards models regressed on single measures of selected risk factors. Thus, clinical CVD risk assessments do not take into account repeated measures of risk factors if available.
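As a rough illustration of how a proportional hazards model turns single measures of risk factors into a risk estimate, the sketch below computes risk = 1 - S0 ** exp(linear predictor), where S0 is the baseline survival at the chosen time horizon. The function name, coefficients, mean values and baseline survival are all hypothetical and are not taken from any published CVD risk score.

```python
import math


def absolute_risk(baseline_survival, coefficients, means, values):
    """Risk of an event by the time horizon under a proportional hazards model.

    risk = 1 - S0 ** exp(sum_j beta_j * (x_j - mean_j)),
    where S0 is the baseline survival at the horizon for a person whose
    risk factors all sit at the mean values.
    """
    linear_predictor = sum(
        coefficients[name] * (values[name] - means[name]) for name in coefficients
    )
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical log hazard ratios per unit; NOT from any published model.
coefficients = {"age": 0.06, "sbp": 0.015, "smoker": 0.6}
# Hypothetical mean risk factor levels used for centring.
means = {"age": 55.0, "sbp": 130.0, "smoker": 0.2}
# Hypothetical 10-year survival at mean risk factor levels.
baseline_survival_10y = 0.95

# Each person contributes ONE measurement per risk factor (the current value).
person_at_mean = {"age": 55.0, "sbp": 130.0, "smoker": 0.2}
person_high = {"age": 65.0, "sbp": 150.0, "smoker": 1.0}

risk_at_mean = absolute_risk(baseline_survival_10y, coefficients, means, person_at_mean)
risk_high = absolute_risk(baseline_survival_10y, coefficients, means, person_high)
print(f"risk at mean levels: {risk_at_mean:.3f}, higher-risk person: {risk_high:.3f}")
```

Note that only the most recent value of each risk factor enters the calculation; any earlier measurements in the person's record are ignored, which is the limitation described above.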
MODELS FOR LONGITUDINAL MEASURES AND TIME-TO-EVENT DATA
There has been active statistical methodology research into constructing joint models for longitudinal measures and time-to-event data. Most joint models use a set of random effects to induce association between the longitudinal measures and the time-to-event outcome, but then require computationally intensive approaches for estimation. Consequently, such models are not commonly applied to real problems despite their qualities and their ability to answer key questions. Whilst the relevance of using joint models to assess aetiological associations between longitudinal measures and disease risk is clear, it is less certain how risk predictions from such models can be applied for clinical CVD risk assessment.
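For orientation, one commonly used shared random effects formulation of a joint model is sketched below. It is given only as an example of the model class described above, not as the specification adopted in this project; the linear trajectory and all symbols are assumptions made for illustration. Here y_i(t_ij) is the j-th observed risk factor value for person i, m_i(t) is the underlying error-free trajectory, b_0i and b_1i are person-specific random effects, w_i are baseline covariates, and alpha links the trajectory to the hazard.

```latex
\begin{align}
y_i(t_{ij}) &= m_i(t_{ij}) + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \\
m_i(t) &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t,
  \qquad (b_{0i}, b_{1i})^\top \sim N(\mathbf{0}, \mathbf{D}), \\
h_i(t) &= h_0(t) \exp\{\boldsymbol{\gamma}^\top \mathbf{w}_i + \alpha\, m_i(t)\}.
\end{align}
```

Because the random effects appear in both the longitudinal and the hazard submodels, the likelihood involves an integral over them for every person, which is the source of the computational burden noted above.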
WHAT IS THE PROPOSED RESEARCH?
The added value of using longitudinal risk factors for CVD risk prediction in observational and clinical data has not been thoroughly evaluated. We propose to determine statistical methods for developing, assessing and validating CVD risk prediction models using longitudinal measures of risk factors. We will:
1. determine an appropriate CVD risk prediction model that incorporates longitudinal measures of risk factors from multiple studies/centres.
2. determine how to extract and translate CVD risk assessments from a risk prediction model that incorporates longitudinal measures.
3. determine a suitable choice of prognostic measures and graphical and/or tabular presentations for assessing the added value of using longitudinal markers in risk prediction models.
4. quantify the added value of using longitudinal measures of risk factors for CVD risk prediction.
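One way the added value in step 4 might be quantified is by comparing a discrimination measure between a model using single measures and a model using longitudinal measures. The sketch below computes Harrell's C-index for two sets of predicted risks. The C-index is only one candidate prognostic measure and the source does not state which measures will be used; the follow-up times, event indicators and predicted risks are invented toy values.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored data (no tied event times assumed).

    A pair (i, j) is usable when person i had an event and person j was still
    under follow-up after that time. The pair is concordant when the person
    with the earlier event has the higher predicted risk; tied scores count 0.5.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored person cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / usable


# Toy data, invented for illustration only.
times = [2, 4, 6, 8, 10]   # years of follow-up
events = [1, 1, 0, 1, 0]   # 1 = CVD event observed, 0 = censored

# Hypothetical predicted risks from two competing models.
risk_single = [0.30, 0.10, 0.20, 0.25, 0.05]        # single measures only
risk_longitudinal = [0.40, 0.30, 0.20, 0.25, 0.05]  # with repeat measures

c_single = concordance_index(times, events, risk_single)
c_longitudinal = concordance_index(times, events, risk_longitudinal)
print(f"C (single) = {c_single}, C (longitudinal) = {c_longitudinal}, "
      f"change = {c_longitudinal - c_single}")
```

In practice the change in C-index would be estimated with a confidence interval and on data not used for model fitting, and would usually be reported alongside calibration and reclassification measures.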
Planned Impact
TRANSLATIONAL OBJECTIVE: to develop a statistical methodology to allow for "electronic updating" of cardiovascular disease (CVD) risk assessments by incorporating past and present measures of risk factors into a patient's CVD risk profile.
1. STRATEGIES FOR CVD SCREENING -> POLICY MAKERS
The planned research has potential to:
* progress the state-of-the-art concerning use of longitudinal measures of risk factors and screening for CVD;
* determine the added value of merging electronic health records with CVD risk algorithms;
* optimize existing individual health records to improve CVD risk assessments leading to reduced costs for national CVD screening;
* increase the effectiveness of CVD screening.
The likely timescale for these impacts is towards the end of the project.
2. IMPROVED CLINICAL RISK PREDICTION -> HEALTH CARE PROVIDERS AND USERS
The planned research has potential to:
* lead to provision of improved and more accurate CVD risk assessments;
* lead to innovative CVD risk scores that will make use of all clinical data available in a standard NHS longitudinal clinical record.
The likely timescale for these impacts is at the end of the project.
3. IMPROVED CLINICAL-DECISION MAKING AND CLINICAL OUTCOMES -> HEALTH CARE PROVIDERS AND USERS
The planned research has potential to:
* lead to improved efficiency of targeting preventive action;
* lead to enhanced quality of life and health;
* enhance summaries of CVD risk predictions by incorporating longitudinal measures of risk factors. Summarising risk in this manner may motivate individuals to adopt lifestyle measures or adhere to medication.
The likely timescale for these impacts is after the end of the project.
4. METHODOLOGICAL IMPACT -> BEYOND CARDIOVASCULAR DISEASE
The methodology and strategies will be widely applicable to screening and prevention strategies for other chronic diseases. The likely timescale for these impacts is towards the end of the project.
Publications
Grootes I (2018) Predicting risk of rupture and rupture-preventing reinterventions following endovascular abdominal aortic aneurysm repair. in The British journal of surgery
Harrison H (2022) Validation and public health modelling of risk prediction models for kidney cancer using the UK Biobank. in BJU international
Jackson D (2014) A design-by-treatment interaction model for network meta-analysis with random inconsistency effects. in Statistics in Medicine
Jackson D (2016) Extending DerSimonian and Laird's methodology to perform network meta-analyses with random inconsistency effects. in Statistics in Medicine
Jenkins V (2015) Psychosocial Factors Associated With Withdrawal From the United Kingdom Collaborative Trial of Ovarian Cancer Screening After 1 Episode of Repeat Screening. in International journal of gynecological cancer : official journal of the International Gynecological Cancer Society
Jochems SHJ (2021) Waist circumference and a body shape index and prostate cancer risk and mortality. in Cancer medicine
Description | Covid-19 vaccine safety |
Geographic Reach | Multiple continents/international |
Policy Influence Type | Contribution to a national consultation/review |
Impact | Our results support the strategy of the UK Joint Committee on Vaccination and Immunisation (JCVI), which is to recommend the COVID-19 vaccine BNT162b2 (or the Moderna mRNA-1273 vaccine) for people under the age of 40 years. |
Guideline Title | European Society of Cardiology - Clinical Practice Guidelines |
Description | SCORE2 equations |
Geographic Reach | Europe |
Policy Influence Type | Citation in clinical guidelines |
Impact | • Improved and updated risk calculators allow tailored use among people aged 40+ to accurately predict who is at risk of having a heart attack or stroke in the next 5 or 10 years • People flagged as having increased risk are recommended personalised preventative treatment • Our tool, called 'SCORE2', has been adopted by the European Guidelines on Cardiovascular |
URL | https://www.escardio.org/Education/ESC-Prevention-of-CVD-Programme/Risk-assessment/esc-cvd-risk-calc... |
Description | Advancing primary prevention strategies for complex diseases |
Amount | £73,500 (GBP) |
Funding ID | HDRUK2023.0239 |
Organisation | Health Data Research UK |
Sector | Private |
Country | United Kingdom |
Start | 09/2023 |
End | 09/2026 |
Description | British Burden of Cardiovascular Disease |
Amount | £50,000 (GBP) |
Funding ID | HDRUK2023.0240 |
Organisation | Health Data Research UK |
Sector | Private |
Country | United Kingdom |
Start | 05/2023 |
End | 04/2024 |
Description | Characterisation, determinants, mechanisms and consequences of the long-term effects of COVID-19: providing the evidence base for health care |
Amount | £10,000,000 (GBP) |
Funding ID | MC_PC_20051 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 03/2021 |
End | 02/2024 |
Description | Concomitant primary prevention of multiple chronic diseases through data-driven approaches mobilising population-wide longitudinal health records |
Amount | £2,000,000 (GBP) |
Funding ID | NIHR303137 |
Organisation | National Institute for Health Research |
Sector | Public |
Country | United Kingdom |
Start | 12/2023 |
End | 12/2028 |
Description | Data Science and Population Health theme of the NIHR Cambridge BRC |
Amount | £86,200,000 (GBP) |
Funding ID | NIHR203312 |
Organisation | National Institute for Health Research |
Sector | Public |
Country | United Kingdom |
Start | 12/2022 |
End | 11/2027 |
Description | EU: Innovative Medicines Initiative - "BigData@Heart" |
Amount | € 19,000,000 (EUR) |
Funding ID | 116074 |
Organisation | European Union |
Sector | Public |
Country | European Union (EU) |
Start |
Description | Efficient AI tools for equitable handling of missing values in population-wide e-health records to advance prevention of chronic diseases |
Amount | £618,984 (GBP) |
Funding ID | EP/Y017757/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2023 |
End | 04/2025 |
Description | Impact of COVID-19 on the association between Type 2 diabetes and incidence of cardiovascular diseases |
Amount | £50,000 (GBP) |
Funding ID | HDRUK2023.0242 |
Organisation | Health Data Research UK |
Sector | Private |
Country | United Kingdom |
Start | 05/2023 |
End | 04/2024 |
Description | Large-scale integrative studies of risk factors in coronary heart disease: from discovery to application |
Amount | £2,017,846 (GBP) |
Funding ID | MR/L003120/1 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 08/2013 |
End | 08/2018 |
Description | Looking beyond the mean: what within-person variability can tell us about dementia, cardiovascular disease and cystic fibrosis |
Amount | £486,957 (GBP) |
Funding ID | MR/V020595/1 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2021 |
End | 03/2024 |
Description | MRC Industrial Strategy PhD Award |
Amount | £360,000 (GBP) |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2018 |
End | 10/2021 |
Description | NIHR BTRU in Donor Health & Genomics |
Amount | £4,000,000 (GBP) |
Organisation | National Institute for Health Research |
Sector | Public |
Country | United Kingdom |
Start |
Description | Owen Taylor PhD |
Amount | £119,000 (GBP) |
Funding ID | RE/18/1/34212 |
Organisation | British Heart Foundation (BHF) |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 09/2021 |
End | 09/2024 |
Description | Phase 1 COVID-19 Longitudinal Health and Wellbeing - National Core Study |
Amount | £9,074,000 (GBP) |
Funding ID | MC_PC_20059 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 03/2021 |
End | 09/2022 |
Description | Pump-priming proposals |
Amount | £50,000 (GBP) |
Organisation | British Heart Foundation (BHF) |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 03/2015 |
End | 11/2015 |
Description | RCUK Innovation / Rutherford Fund Fellowships |
Amount | £760,000 (GBP) |
Organisation | Research Councils UK (RCUK) |
Sector | Public |
Country | United Kingdom |
Start | 07/2018 |
End | 08/2021 |
Description | The risk of stroke after SARS-CoV-2 in a UK population-wide cohort |
Amount | £60,000 (GBP) |
Funding ID | SA_CV_20/100018 |
Organisation | Stroke Association |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 03/2021 |
End | 03/2022 |
Description | Towards early identification of adolescent mental health problems |
Amount | £100,577 (GBP) |
Funding ID | MR/T046430/1 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 06/2020 |
End | 10/2021 |
Description | Using machine learning for personalised CVD risk management |
Amount | £91,414 (GBP) |
Funding ID | BDCSA_100005 Wood |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start |
Description | Validation of SCORE2 10-year cardiovascular disease risk prediction models before and after the Covid-19 pandemic in the population of England |
Amount | £50,000 (GBP) |
Funding ID | HDRUK2023.0237 |
Organisation | Health Data Research UK |
Sector | Private |
Country | United Kingdom |
Start | 05/2023 |
End | 04/2024 |
Title | CVD-COVID-UK/COVID-IMPACT |
Description | See https://www.hdruk.ac.uk/projects/cvd-covid-uk-project/ CVD-COVID-UK established a novel population-wide resource in partnership with NHS Digital, comprising a range of linked datasets covering the entire population of England, including: hospital data; death registrations; primary care data; community dispensing data; COVID-19 vaccination and lab test data; and data from intensive care units and from cardiovascular specialist registries. |
Type Of Material | Data analysis technique |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | Results from analyses using this research database have informed national COVID-19 Advisory Groups and public health agencies on COVID-19 vaccine safety. |
URL | https://www.hdruk.ac.uk/projects/cvd-covid-uk-project/ |
Description | CRUK International Alliance for Cancer Early Detection - Real-world risk-stratified early detection and diagnosis using linked electronic health records data |
Organisation | University College London |
Department | Institute of Epidemiology and Health Care |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Leading statistical methods development and application |
Collaborator Contribution | Contributing clinical expertise |
Impact | Successful grant award for multi-disciplinary team science. Co-applicant: CRUK International Alliance for Cancer Early Detection - Real-world risk-stratified early detection and diagnosis using linked electronic health records data, £800K |
Start Year | 2020 |
Description | Causal inference |
Organisation | University of Cambridge |
Department | MRC Biostatistics Unit |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Statistical expertise |
Collaborator Contribution | Joint supervision of PhD student |
Impact | Papers submitted |
Start Year | 2017 |
Description | Machine learning and AI |
Organisation | University of Cambridge |
Department | Department of Applied Mathematics and Theoretical Physics (DAMTP) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Collaborating with Mihaela van der Schaar in various machine learning and AI projects. |
Collaborator Contribution | Contributing methods development and data |
Impact | Not yet |
Start Year | 2019 |
Description | Statistical methods in E-health records |
Organisation | University College London |
Department | Institute of Neurology |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Statistical expertise |
Collaborator Contribution | Scientific questions, eg, imputing missing ethnicity information |
Impact | Various projects - not yet published. |
Start Year | 2014 |
Description | Risk prediction workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Type Of Presentation | Workshop Facilitator |
Geographic Reach | International |
Primary Audience | Other academic audiences (collaborators, peers etc.) |
Results and Impact | I organised a workshop on "Statistical Challenges in Risk Prediction" with 30 international participants in Cambridge in November 2012. The output of the workshop will be a special issue of the Biometrical Journal. |
Year(s) Of Engagement Activity | 2012 |
Description | School work experience placement |
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Schools |
Results and Impact | I organised a work experience placement for a local 6th form student with interest in combining maths with biology. |
Year(s) Of Engagement Activity | 2020 |
Description | Teaching at post-graduate level in MPhil courses |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | "Risk prediction" workshop day for students on the MPhil in Epidemiology, Public Health and Primary Care courses. |
Year(s) Of Engagement Activity | 2014,2015 |