Discriminant Function Analysis for Longitudinal Data: Applications in Medical Research (DiALog)

Lead Research Organisation: University of Liverpool
Department Name: Institute of Translational Medicine

Abstract

Identifying the correct diagnosis as early as possible, especially before the patient presents clear structural and/or functional changes, is key for a successful treatment. For example, delays in the treatment of patients with suspected encephalitis can have a devastating impact, including severe cognitive disability. Hence, it is not surprising that in most medical disciplines research focuses on identifying biomarkers and risk factors, some of them measured over time, to correctly predict patient outcomes. We propose to develop a novel statistical methodology that allows us to identify patients at higher risk of developing a particular disease or condition sooner than is currently achieved, with subsequent benefits for patients, clinicians and for health care system costs.

Delay in the detection of a disease is directly linked to the economic burden on the National Health Service and on communities. However, the mechanisms put in place to achieve early detection can themselves be economically demanding. Patients with diabetes, for instance, are screened annually for diabetic retinopathy at a considerable cost to the NHS. Bearing in mind that fewer than 4% of patients with diabetes will develop diabetic retinopathy within a year, there is great interest in being able to identify patients at higher risk in order to tailor the screening intervals. In other words, while patients at higher risk should be screened more often than once per year, the low-risk group (involving 96% of the study population) could be screened less often than annually, significantly reducing both NHS costs and the burden on patients and clinicians. Considering that in the UK alone there will be about 4 million people with diabetes by 2025, individualised screening is vital and is expected to reduce annual costs to the NHS by more than £100 million without reducing medical screening efficacy.

In addition to the importance of early diagnosis, being able to identify early that someone is likely to show a poor prognosis is also essential to improve clinical management and optimise resources. For example, approximately one third of patients with epilepsy do not achieve remission from seizures following drug therapy. Early identification of this patient group would allow clinicians to focus on alternative treatments (e.g., surgery) as early as possible.

We aim to develop a novel time-dependent approach for discriminant analysis that (a) depends directly on both the individual baseline covariate information and the longitudinal data to achieve a more precise classification, (b) allows us to detect the earliest time point at which successful classification can be achieved (with a predefined error), and (c) can be applied to classify individuals into two or more groups, with unequal variance-covariance matrices among groups and incorporation of costs. Our objectives are divided into 3 main parts: development, implementation and clinical applications.

Technical Summary

Identifying the correct diagnosis as early as possible, especially before the patient presents clear structural and/or functional changes, is key for a successful treatment. For example, delays in the treatment of patients with suspected encephalitis can have a devastating impact, including severe cognitive disability. Hence, it is not surprising that in most medical disciplines research focuses on identifying biomarkers and risk factors, some of them measured over time, to correctly predict patient outcomes. We propose to develop a novel statistical methodology that allows us to identify patients at higher risk of developing a particular disease or condition sooner than is currently achieved, with subsequent benefits for patients, clinicians and for health care system costs.

Discriminant function analysis can be used to make predictions of the group to which a patient most likely belongs (e.g., groups defined by high/low risk of developing diabetic retinopathy within a year). The use of longitudinally observed data can be particularly powerful in discriminant analysis. In longitudinal data analysis, responses are recorded at different time points and/or different locations, adding a longitudinal dimension to the data set. A promising approach to discrimination using longitudinal data for clinical applications is via extension of linear mixed models.

We aim to develop a novel time-dependent approach for discriminant analysis that (a) depends directly on both the individual baseline covariate information and the longitudinal data to achieve a more precise classification, (b) allows us to detect the earliest time point at which successful classification can be achieved (with a predefined error), and (c) can be applied to classify individuals into two or more groups, with unequal variance-covariance matrices among groups and incorporation of costs. Our objectives are divided into 3 main parts: development, implementation and clinical applications.
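
As an illustration only, the three aims above can be sketched with a simple Gaussian discriminant rule applied to a growing sequence of visits. This is not the DiALog methodology or its R package: the function names, the random-intercept covariance structure, the group means, the variances and the 1% error threshold are all hypothetical choices, and the 4% prior is borrowed loosely from the diabetic retinopathy example. Misclassification costs are left out for brevity.

```python
import numpy as np


def mvn_logpdf(y, mean, cov):
    """Log density of a multivariate normal, computed via a Cholesky factor."""
    diff = y - mean
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, diff)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + log_det + z @ z)


def group_posteriors(y, means, covs, priors):
    """Posterior group probabilities given the first len(y) visits.

    Each group has its own mean trajectory and its own covariance matrix,
    so the resulting rule is quadratic rather than linear.
    """
    k = len(y)
    log_post = np.array([
        np.log(p) + mvn_logpdf(y, m[:k], c[:k, :k])
        for m, c, p in zip(means, covs, priors)
    ])
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


def earliest_classification(y, means, covs, priors, max_error=0.01):
    """Smallest number of visits at which one group reaches 1 - max_error."""
    post = np.asarray(priors, dtype=float)
    for k in range(1, len(y) + 1):
        post = group_posteriors(y[:k], means, covs, priors)
        if post.max() >= 1.0 - max_error:
            return k, int(post.argmax()), post
    return None, int(post.argmax()), post


# Hypothetical setting: five annual visits, two groups (0 = low risk, 1 = high risk).
times = np.arange(5, dtype=float)
means = [1.0 + 0.1 * times, 1.0 + 0.6 * times]
# Random-intercept covariance: shared intercept variance plus group-specific noise.
covs = [0.25 * np.ones((5, 5)) + 0.04 * np.eye(5),
        0.25 * np.ones((5, 5)) + 0.09 * np.eye(5)]
priors = [0.96, 0.04]

patient = means[1].copy()  # a patient who follows the high-risk mean exactly
n_visits, group, post = earliest_classification(patient, means, covs, priors)
print(n_visits, group, np.round(post, 4))
```

With a low prior for the high-risk group, a single visit is not enough to overturn it in this toy setting; the trajectory has to diverge over several visits before the posterior crosses the threshold. A mixed-model approach would estimate the mean trajectories and covariance matrices from data rather than fixing them by hand as done here.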

Planned Impact

We see this project as a DiALog between clinicians and statisticians leading to a novel methodology that allows us to identify patients at higher risk of developing a particular disease or condition sooner than is currently achieved, with subsequent benefits for patients, clinicians and for health care system costs.
Our proposed research will have a broad range of impacts with different timescales.

Short-term impact (1-3 years and beyond)

Advance of scientific knowledge
We propose to develop a novel methodology in the area of discriminant function analysis that innovatively incorporates recent advances in longitudinal methods. In close collaboration with clinicians we will develop a time-dependent approach for discriminant analysis that will mark a turning point in the way multivariate clinical data are analysed, strengthening evidence-based health decisions. Specifically, it will allow us to give recommendations to clinical researchers on the time points at which data need to be collected to achieve a chosen precision while minimising costs. Furthermore, the new methodology will lead to individualised assessments of risk applicable to any medical area. Finally, in addition to a clear advancement of discriminant biostatistical methods, this project provides a good platform to translate the outcomes into clinical practice (see 'Activities' and 'Medium and long-term impact' below).

Development of a freely available statistical package
We will construct a statistical package for the software R that implements the methodology developed and validated in Task 1. This package will be freely available for the Windows, Mac and Linux operating systems. We will make use of the applications to clinical data to create worked examples tailored to clinical researchers for a better understanding of the concepts and assumptions behind the new methods.

Activities:

(i) Workshops: We will design, develop and deliver a workshop so that the state-of-the-art methods and statistical package are effectively disseminated.

(ii) Multidisciplinary research meetings across the UK: We will use our collaboration with the Royal Statistical Society to effectively disseminate the findings in multidisciplinary venues.

(iii) Development of a website for DiALog (to publicise DiALog research activities), which will be updated and maintained during the lifetime of the project.

(iv) Other activities include seminars, conferences, publications and academic visits.


Medium and long-term impact (2-4 years and beyond)
Impact to NHS policies and costs
DiALog will make it possible to identify patients who are at higher risk of developing a disease or condition, and to do so at the earliest possible time point. This will allow clinicians to act in advance with appropriate preventive measures. Patients with diabetes, for instance, are screened annually for diabetic retinopathy at a considerable cost to the NHS.

Patients
Our fundamental goal with this proposal is to contribute to better health. The applicants of DiALog are either PIs or co-applicants of several medical research projects in areas such as infectious diseases, diabetes and diabetic retinopathy, epilepsy, stroke and cancer, among others. We have gained experience in discussing with clinicians and patient groups how to translate statistical results into findings that are relevant to them. DiALog will allow us to develop a platform with new clinical applications, and the clinical findings will be disseminated via meetings, the DiALog website, newsletters, study updates, workshops and conferences, invited talks and publications.

Publications

Hughes DM (2021) Serum Levels of a-Fetoprotein Increased More Than 10 Years Before Detection of Hepatocellular Carcinoma. in Clinical gastroenterology and hepatology : the official clinical practice journal of the American Gastroenterological Association

Hughes DM (2018) A comparison of group prediction approaches in longitudinal discriminant analysis. in Biometrical journal. Biometrische Zeitschrift

Lythgoe D (2018) Latent Class Modeling with A Time-To-Event Distal Outcome: A Comparison of One, Two and Three-Step Approaches. in Structural Equation Modeling: A Multidisciplinary Journal

Probert C (2020) Faecal volatile organic compounds in preterm babies at risk of necrotising enterocolitis: the DOVE study. in Archives of Disease in Childhood - Fetal and Neonatal Edition

 
Description (RenalToolBox) - Developing novel tools and technologies to assess the safety and efficacy of cell-based regenerative medicine therapies, focusing on kidney disease
Amount € 4,071,175 (EUR)
Funding ID 813839 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 11/2018 
End 10/2022
 
Description Brain architecture and connectivity at epilepsy diagnosis: markers of cognitive dysfunction and pharmacoresistance
Amount £778,719 (GBP)
Funding ID MR/S00355X/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 05/2019 
End 04/2024
 
Description British and Irish Biometric Society/Fisher Memorial Trust Travel Bursary
Amount £400 (GBP)
Organisation International Biometrics Society 
Sector Academic/University
Country Global
Start 04/2017 
End 04/2017
 
Description EPSRC Centre for Mathematical Sciences in Health Care
Amount £2,500,000 (GBP)
Funding ID EP/N014499/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2016 
End 12/2019
 
Description National Productivity Investment Fund
Amount £321,265 (GBP)
Funding ID MR/R024847/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 11/2017 
End 11/2020
 
Description Neurodevelopment after prenatal exposure to seizures (NAPES) Study
Amount £149,963 (GBP)
Funding ID P1703 
Organisation Epilepsy Research UK 
Sector Charity/Non Profit
Country United Kingdom
Start 03/2018 
End 02/2021
 
Description Workshop Scholarship for Young Researcher
Amount £500 (GBP)
Organisation University of Warwick 
Sector Academic/University
Country United Kingdom
Start 07/2015 
End 07/2015
 
Description Dissemination of DiALog work at research meetings (including invited talks) at universities across the UK and abroad (including KCL, Keele, Birmingham, Lisbon)
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The intention of these research meetings is to disseminate the novel multivariate discriminant approach for clinical prediction that we recently developed under the DiALog grant. At these meetings we engage with researchers, academic staff and students from a range of disciplines (mathematics, computer science, statistics and a wide range of clinical areas). Discussions focussed on ways to improve the current approaches for early detection of diseases.
Year(s) Of Engagement Activity 2017,2018
 
Description International conference (Netherlands) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation on Early Detection of Diabetic Retinopathy using Personalised Longitudinal Discriminant Analysis
36th Annual Conference of the International Society for Clinical Biostatistics, 23-27 August 2015, Utrecht, the Netherlands.
Year(s) Of Engagement Activity 2015
 
Description MRC Festival of Medical Science 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact The Medical Research Council held a Festival of Medical Science event in Liverpool in 2016. As part of a project funded by the MRC, Dr David Hughes created a poster describing the statistical work we have done, aimed at informing the general public about the MRC-funded work being undertaken in Liverpool. The event was well attended and led to many interesting discussions with attendees. Of particular interest were conversations with attendees who had the conditions we were researching, or had family members who did. This allowed an interesting transfer of information in both directions, giving us a patient-focused perspective and allowing us to explain some of the benefits that could result from our work.
Year(s) Of Engagement Activity 2016