Time-dependent Robust Joint Modelling: Analysing a wealth of longitudinal outliers

Lead Research Organisation: Queen's University Belfast
Department Name: Sch of Mathematics and Physics

Abstract

Joint modelling is a sophisticated technique that allows one to simultaneously analyse the evolution, over time, of repeated measurements from individuals and the impact this has on the time to a particular event of interest. Commonly, it is applied to medical applications where patients are observed over time with the aim of investigating how and why their responses change to treatment and how this affects their survival. From this, it is evident that such approaches can be applied to a vast array of research questions, from cancer research to the analysis of chronic diseases such as heart disease, diabetes, stroke, to name but a few. As a result of this advantage, the volume of research publications utilising joint models has exploded in the last few decades.

Despite this, only limited research effort has been directed at investigating one of the key assumptions of these models: that the random terms within these models are normally distributed. This prevailing assumption of normality is detrimentally impacted when longitudinal outliers are present. Simple removal of these outliers would not only reduce sample size but, more importantly, would exclude important cases which commonly guide innovation in biomedical sciences; it is typically the analysis of outlying cases which tells us more about disease progression. Instead, this research will advance robust joint modelling techniques which both restrict the impact of outliers, allowing more accurate and precise estimates to be obtained, and allow a high level of precision in the identification of such outliers for further exploration.

However, this research area is in its infancy, with the volume of work to date on robust joint modelling being somewhat limited. This is due to the potentially restrictive assumptions of the current methodology for these models, i.e. that the impact of outliers is constant, unchanging over time. There are no established theoretical tools for handling outlier impacts that change over time, an undesirable situation that will be rectified through this research. To do so, I will develop a novel methodology, the time-varying outlier impacts (TOI) approach, which will allow the degree to which outliers are down-weighted to change over time. Doing so will allow more realistic scenarios to be modelled using such techniques, for example, modelling patients' reactions to starting a new treatment, accounting for the fact that it will take time for them to adjust to the new treatment, which could result in outlying measurements being taken from such patients or in all measurements taken from the patient outlying from the trends of the population.

Another reason for limited research utilising robust joint modelling techniques is the lack of available software to fit such models. It has only been in recent years, since the introduction of the JM software package in R in 2008, that software has become available to fit standard joint models. Each of these joint modelling software packages has normal distributional assumptions for the random terms and thus cannot handle the analysis of data which contain longitudinal outliers, providing biased and imprecise estimates in the presence of outliers. This issue will also be alleviated through the work undertaken in this project through the development of a software package in R for robust joint modelling that will utilise the newly developed TOI approach.

Planned Impact

The overall goal of this research is fundamental theoretical developments in robust joint modelling methodology. As such, it will have wide and far-reaching academic impacts, holding the potential to revolutionise this field of research through the removal of restrictive assumptions which have halted the utilisation of robust joint models in current literature. This academic impact will not only be felt within statistics but in the multitude of application areas in which such research may be utilised, for example, renal research, as will be evidenced by the findings of the proposed work. Due to the ability of statistics to impact an array of applications in medicine, geostatistics, business analytics, astrostatistics, econometrics, environmental statistics and epidemiology, to name but a few, the EPSRC has deemed 'Statistics and applied probability' as a growth research area within the theme of 'Mathematical sciences', where, as stated by the EPSRC, "this research area provides economic, industrial and societal impact".

Examples of such societal impacts arise from the gains in understanding; for example, in renal research, patients' haemoglobin level is an emerging biomarker whose volatility in the initial stages upon commencement of haemodialysis has a great impact on the risk of death of such patients. More generally, a significant impact of this work is the ability of robust joint models to identify individuals who are classified as outliers: patients who do not react to treatments in the same way as typical patients and thus are in need of more personalised treatment plans. Previous research suggests that such patients tend to have worse survival rates and therefore the enhancement of techniques able to identify such at-risk patients is a key societal impact. In the long term, this has the potential to have economic impact on the NHS as a method to enable personalised medicine to become a reality.

An additional important impact of this project would be the development of the skills of the personnel involved in the project. A concern expressed by the International Review of Mathematical Sciences (IRMS) 2010 is the fragility of UK statistics due to the shortage of researchers at various career stages. This project will help to address this issue, giving both the PI and PDRA the opportunity to develop skills that will stand them in good stead for their future careers in the field, benefiting greatly from the expertise and guidance received from the research visits to and from our collaborators that this project will facilitate. Added value is given through the cross-disciplinary nature of this project (statistics and medicine), which will further enhance the skills of the personnel involved.
 
Description This research established the theoretical development of both robust mixed and joint models with time-varying degrees of freedom, a measure which controls the extent to which the detrimental impact of longitudinal outliers is down-weighted. In allowing this to evolve over time, a better representation of the common situation where patients take time to stabilise and adjust to new treatments is gained, enabling clinicians to better understand the processes which cause outliers. This is illustrated through several medical applications which are explored in this work. Through exploration of the properties of the corresponding estimators, this research has enhanced the theoretical understanding of the impact of longitudinal outliers. To accelerate the translation of such techniques to practical applications, an open-source R software package, "robjm", has been developed for both robust mixed and joint models.
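As an illustration of how degrees of freedom control down-weighting, the following is a minimal sketch and not the grant's implementation. It uses the standard result that, for a univariate Student-t error with df degrees of freedom, the expected latent weight given a standardised residual r is (df + 1) / (df + r^2). The function `df_over_time` and its parameter values are hypothetical, chosen only to mimic a patient who stabilises after starting treatment.

```python
import math


def t_weight(residual, df):
    """Weight implied by a Student-t error model for a standardised residual.

    For a univariate t distribution with `df` degrees of freedom, the
    conditional expectation of the latent mixing weight is
    (df + 1) / (df + residual**2): around 1 for typical residuals and
    small for extreme ones.
    """
    return (df + 1.0) / (df + residual ** 2)


def df_over_time(t, df_start=3.0, df_end=30.0, rate=1.0):
    """Hypothetical smooth curve: low df at t = 0, rising towards df_end."""
    return df_end - (df_end - df_start) * math.exp(-rate * t)


if __name__ == "__main__":
    residual = 4.0  # a measurement four standard deviations from its fitted value
    for t in (0.0, 1.0, 3.0, 6.0):
        df = df_over_time(t)
        print(f"t={t:3.1f}  df={df:6.2f}  weight={t_weight(residual, df):.3f}")
```

Under this assumed curve, the same residual receives a smaller weight early on, when the degrees of freedom are low, than later, when the error distribution is closer to normal.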
Exploitation Route Although in its infancy, this field of research has great potential. The synergy between individuals' repeated measurements and the time until an event of interest can be found in a wide variety of applications, such as renal, cancer and genetic research, amongst others. Thus, the theoretical developments and software package created in this work will be of significance to an extensive audience. As illustrated in the analysis of renal patients undertaken in this project, such methods allow a deeper understanding of the individual-specific reaction that patients can have to new treatments. Hence, this has the potential to be a key model in the promotion of precision medicine. Alongside medical researchers, this timely work will be of interest to the growing number of academics studying mixed and joint modelling both nationally and internationally, with the open-source R software, "robjm", providing a user-friendly way to put these novel techniques into practice.
Sectors Digital/Communication/Information Technologies (including Software); Education; Environment; Financial Services, and Management Consultancy; Healthcare; Leisure Activities, including Sports, Recreation and Tourism; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

 
Description The statistical methodology and software that were developed through this grant are being utilised with the aim of improving patient care in the ICU. This is part of a collaboration between the PI for this grant, the ICU department at Royal Victoria Hospital, Belfast, and the Institute of Electronics, Communications & Information Technology (ECIT) at Queen's University Belfast. Intensive longitudinal and survival data are being gathered by my collaborators, a Consultant Intensive Care Physician in the Royal Victoria Hospital, Belfast, and a Senior Lecturer in Computer Science in ECIT at Queen's University Belfast. These data perfectly suit the methods developed by this grant, allowing the longitudinal trajectories of patients' biomarkers to be more accurately and precisely modelled. This enables the early detection and identification of patients who aren't reacting typically to treatments, allowing such patients to get specialist care and a change in treatment plans to lessen the impact or stop the development of hospital-acquired infections, such as ventilator-associated pneumonia, or complications in relation to mechanical ventilation. The illustration of the practical benefits of such methods has the potential to change medical policies in ICU departments in the future. In addition, work is ongoing to illustrate the precision, added interpretation and potential for the development of novel metrics in a sporting context through the utilisation of statistical methodology developed by this grant. Specifically, individual, real-time data have been collected from elite local football players and analysed using the novel statistical algorithms for both training and in-match scenarios. This work illustrates the potential for outlier identification provided by the statistical algorithms as an early warning system for adverse events, such as injury or, in rare circumstances, cardiac arrest.
The identification of outliers in the stream of heart rate data has the potential to shape sporting policy, where major sports bodies now all run heart screening programmes for players at the beginning of their professional careers. The potential for this to be used as an early warning system in practice will be explored in close collaboration with relevant medical experts.
First Year Of Impact 2021
Sector Healthcare; Leisure Activities, including Sports, Recreation and Tourism
Impact Types Societal; Policy & public services

 
Description EPSRC Impact Acceleration Account Funding Award
Amount £14,980 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 02/2023 
End 06/2023
 
Description EPSRC Impact Acceleration Account Funding Award
Amount £9,720 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 02/2024 
End 05/2024
 
Description ICMS Knowledge Exchange Catalyst
Amount £78,760 (GBP)
Organisation International Centre for Mathematical Sciences 
Sector Public
Country United Kingdom
Start 08/2023 
End 08/2024
 
Title Identification of individuals whose medical trajectories differ from the rest of the population on average 
Description Statistical methods have been developed for the identification of longitudinal outliers, in particular, the capture of outlying measurements which vary over time in the degree to which they are outlying. These methods utilise the theory developed by this grant, applying Bayesian estimation to obtain a better understanding of the true underlying medical profile of patients to enable clinicians to identify those individuals who might not react in the same way as the overall population to typical treatment plans. 
Type Of Material Physiological assessment or outcome measure 
Year Produced 2021 
Provided To Others? Yes  
Impact Publication in 2021 in the international journal, Biometrical Journal, entitled "Robust joint modelling of longitudinal and survival data: Incorporating a time-varying degrees-of-freedom parameter" 
URL https://onlinelibrary.wiley.com/doi/full/10.1002/bimj.202000253?casa_token=0owbqjesG74AAAAA%3AozPQlB...
 
Title robjm software package 
Description An R software package, "robjm", has been developed which implements the novel methodology that has been established through this grant. The package is user-friendly and available via GitHub. A major reason for the limited implementation of robust joint modelling techniques in the past has been the lack of available software to fit such models. This package solves that issue by allowing users to fit a wide variety of robust linear mixed effects models and robust joint models, including the novel methodology that was developed by this grant: robust joint modelling which accommodates and down-weights the detrimental impact of time-varying longitudinal outliers. 
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? Yes  
Impact Several publications in international, high-quality journals have utilised the package to date: a publication in 2020 in Statistical Methods in Medical Research entitled "Dynamic predictions of kidney graft survival in the presence of longitudinal outliers"; and a publication in 2021 in the Biometrical Journal entitled "Robust joint modelling of longitudinal and survival data: Incorporating a time-varying degrees-of-freedom parameter". 
URL https://github.com/ozgurasarstat/robjm
 
Title Robust Mixed and Joint Modelling Methodology: Time-varying degrees of freedom 
Description This research established the theoretical development of both robust mixed and joint models with time-varying degrees of freedom utilising Bayesian approaches in the estimation of parameters. Previous robust mixed and joint models employed a restrictive assumption of time-constant degrees of freedom which this research shows to cause bias and inefficiency in the estimation of parameters under the realistic scenario that the impact of outliers varies over time. The open-source R software, "robjm", provides a user-friendly way to put these novel techniques into practice. 
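The distinction between time-constant and time-varying degrees of freedom can be sketched with the standard scale-mixture representation of the t distribution. The notation below is assumed for illustration and is not taken from the grant outputs: $y_{ij}$ is measurement $j$ on individual $i$ at time $t_{ij}$, $x_{ij}$ and $z_{ij}$ are covariate vectors, $b_i$ are random effects, and $\nu(\cdot)$ is a degrees-of-freedom function. For simplicity only the error term is shown as heavy-tailed, with the Gamma distribution in shape-rate form.

```latex
\begin{align*}
y_{ij} &= x_{ij}^\top \beta + z_{ij}^\top b_i + \varepsilon_{ij}, \\
\varepsilon_{ij} \mid u_{ij} &\sim N\!\left(0,\ \sigma^2 / u_{ij}\right), \\
u_{ij} &\sim \mathrm{Gamma}\!\left(\tfrac{\nu(t_{ij})}{2},\ \tfrac{\nu(t_{ij})}{2}\right).
\end{align*}
```

Marginally, $\varepsilon_{ij}$ then follows a t distribution with scale $\sigma$ and $\nu(t_{ij})$ degrees of freedom. The time-constant assumption corresponds to $\nu(t) \equiv \nu$, and the normal model is recovered as $\nu \to \infty$.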
Type Of Material Data analysis technique 
Year Produced 2018 
Provided To Others? Yes  
Impact We have received requests for further information, and several researchers have informed us that they intend to use our corresponding software package in R ("robjm" package). 
URL https://github.com/ozgurasarstat/robjm
 
Title robjm package in R 
Description Implementation of robust mixed and joint models with time-varying degrees of freedom utilising Bayesian approaches in the estimation of parameters. In addition to these new models, the package also allows estimation of mixed and joint models under a variety of assumptions: with normality assumptions for the random effects (to be utilised when no outliers are present), with time-constant degrees of freedom equal for both random effects and random error terms and with time-constant degrees of freedom differing for both random terms. The package fits mixed and robust joint models where one of the random effects or random error terms can be normally distributed with the other random term t-distributed. The package allows visualisation of the shape of the degrees of freedom when it varies over time as well as simulation of data assuming any of the models listed above. 
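The kind of data simulation described above can be sketched as follows. This is a minimal, self-contained Python illustration under stated assumptions, not the robjm interface: the function names `rt` and `simulate_subject`, the random-intercept structure, and the degrees-of-freedom curve `early_heavy` are all hypothetical. Errors are drawn from a t distribution whose degrees of freedom depend on time, or from a normal distribution when no degrees-of-freedom function is supplied.

```python
import math
import random


def rt(df, rng):
    """Draw one standard Student-t variate with `df` degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-squared with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def simulate_subject(times, intercept, slope, sigma, df_fn, rng):
    """Simulate one subject's measurements with a normal random intercept.

    df_fn maps a time point to the degrees of freedom of the error term;
    pass None for normally distributed errors.
    """
    b = rng.gauss(0.0, 1.0)  # random intercept, kept normal for simplicity
    ys = []
    for t in times:
        if df_fn is None:
            err = rng.gauss(0.0, 1.0)
        else:
            err = rt(df_fn(t), rng)
        ys.append(intercept + b + slope * t + sigma * err)
    return ys


def early_heavy(t):
    """Hypothetical df curve: heavy tails at t = 0, lighter tails later."""
    return 3.0 + 5.0 * t


rng = random.Random(2018)  # fixed seed so the simulation is reproducible
times = [0.0, 0.5, 1.0, 2.0, 4.0]
robust = [simulate_subject(times, 10.0, -0.5, 1.0, early_heavy, rng) for _ in range(200)]
normal = [simulate_subject(times, 10.0, -0.5, 1.0, None, rng) for _ in range(200)]
```

Comparing `robust` with `normal` at the earliest time points gives a simple way to see how early heavy tails generate outlying measurements that a normal model would not anticipate.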
Type Of Material Computer model/algorithm 
Year Produced 2018 
Provided To Others? Yes  
Impact Several researchers have informed us they intend to use our software. 
URL https://github.com/ozgurasarstat/robjm
 
Description Özgür Asar - robjm 
Organisation Acibadem University
Country Turkey 
Sector Academic/University 
PI Contribution The research completed as part of this grant has been a collaborative work involving the research team and this collaborator. Upon completion of a systematic review of the literature, the research team has taken the lead in the theoretical development of the novel methodology that underpins this research. Working closely with the collaborator, the research team has worked on the development of a software package in R, "robjm". The research team is in the process of conducting a simulation study to validate both the software package and the novel methodology. Due to the size of the simulation study, the High Performance Computing facilities at Queen's University Belfast are being employed to undertake this study. Utilising the expertise gained from attendance at the Royal Society training course "Introduction to Public Engagement", the research team, alongside the collaborator, has disseminated the findings of this work at each stage of the project. To date, this has been achieved through invited presentations at the Royal Statistical Society seminar series in Northern Ireland, conference presentations at both national and international conferences and an invited talk at a Big Data Workshop in Istanbul. This work has strengthened the collaboration between the research team and this collaborator, and a joint research bid is currently being written based on one of the many avenues of research that have emanated from this work.
Collaborator Contribution Working in conjunction with the collaborator, an R software package, "robjm", has been developed which implements the novel methodology that has been established through this grant. This collaborator's expertise in the development of software packages has been an important factor to the success of this work. In addition to regular Skype meetings, having the collaborator visit the UK for a week as part of this grant allowed in depth discussion on the research methods and fostered ideas for further enhancement of the software package. This has made the package more user-friendly and is envisaged to help encourage use of the package by both those in the statistical and medical domains. In addition to this, due to the complex nature of the problem under investigation, this collaborator's expert knowledge in the research area has been extremely beneficial. Having hosted the Research Associate at their university for several weeks in the course of this project, this collaborator has provided training to the Research Associate in the development of software packages in R and on the estimation techniques that this work utilises. Due to the direction that the research undertook, this collaborator also introduced the research team to a further collaborator who is one of the leading experts in the estimation area and with whom we are in the process of writing a journal publication which is due to be submitted soon.
Impact Non-Gaussian Statistical Models for Individualized Predictions; Investigating the effect of longitudinal outliers on mixed effects models; Robust mixed modelling: A new approach to handle time-varying outlier impacts; Investigating the impact of time-varying outlier patterns: Longitudinal analysis of Northern Ireland renal patients; Time-varying outlier impacts on robust mixed models with an application in renal research; Longitudinal and survival analysis methods for modelling healthcare applications; The impact of time-varying outliers on mixed effects models: a simulation study motivated by renal data
Start Year 2018
 
Description Improving Patient Care in ICU: Robust Modelling of Longitudinal & Survival Data 
Organisation Queen's University Belfast
Department Institute of Electronics, Communications and Information Technology (ECIT)
Country United Kingdom 
Sector Academic/University 
PI Contribution This research continues the work that was undertaken during the grant, applying the methods that were developed to help improve the care of patients in the ICU in the Royal Victoria Hospital. I am the statistical expert on the collaboration and am responsible for the study design and analysis utilising advanced statistical techniques.
Collaborator Contribution The aim of the study is firstly to demonstrate the improvements which can be made using advanced statistical techniques within robust joint modelling of longitudinal and survival data. For this, my collaborators provide the infrastructure to be able to collect the necessary data in real-time from patients within the ICU department. I am working with my collaborator in the Institute of Electronics, Communications & Information Technology at Queen's University Belfast to incorporate machine learning techniques into the statistical processes, to further advance the methods. Our collaborator within the ICU department is a Consultant Intensive Care Physician and he advises on the practical side of the analysis to ensure we maximise the usefulness of outputs and to enable us to put our processes into practice to help patients.
Impact This collaboration is multi-disciplinary, bringing together experts in statistical modelling, computer science and a Consultant Intensive Care Physician.
Start Year 2021
 
Description Improving Patient Care in ICU: Robust Modelling of Longitudinal & Survival Data 
Organisation Royal Victoria Hospital, Belfast
Country United Kingdom 
Sector Hospitals 
PI Contribution This research continues the work that was undertaken during the grant, applying the methods that were developed to help improve the care of patients in the ICU in the Royal Victoria Hospital. I am the statistical expert on the collaboration and am responsible for the study design and analysis utilising advanced statistical techniques.
Collaborator Contribution The aim of the study is firstly to demonstrate the improvements which can be made using advanced statistical techniques within robust joint modelling of longitudinal and survival data. For this, my collaborators provide the infrastructure to be able to collect the necessary data in real-time from patients within the ICU department. I am working with my collaborator in the Institute of Electronics, Communications & Information Technology at Queen's University Belfast to incorporate machine learning techniques into the statistical processes, to further advance the methods. Our collaborator within the ICU department is a Consultant Intensive Care Physician and he advises on the practical side of the analysis to ensure we maximise the usefulness of outputs and to enable us to put our processes into practice to help patients.
Impact This collaboration is multi-disciplinary, bringing together experts in statistical modelling, computer science and a Consultant Intensive Care Physician.
Start Year 2021
 
Description Jonas Wallin - Robust Joint Models with time-varying outlier impacts 
Organisation Lund University
Country Sweden 
Sector Academic/University 
PI Contribution The research completed as part of this grant has been a collaborative work involving the research team and this collaborator. Upon completion of a systematic review of the literature, the research team has taken the lead in the theoretical development of the novel methodology that underpins this research. Due to the complexity of the methods being developed, Bayesian estimation was employed. Working closely with the collaborator, the research team has worked on the development of a software package in R, "robjm", and utilised this software to conduct a simulation study to validate both the software package and the novel methodology. Due to the size of the simulation study, the High Performance Computing facilities at Queen's University Belfast are being employed to undertake this study. Utilising the expertise gained from attendance at the Royal Society training course "Introduction to Public Engagement", the research team has disseminated the findings of this work at each stage of the project. To date, this has been achieved through invited presentations at the Royal Statistical Society seminar series in Northern Ireland, conference presentations at both national and international conferences and an invited talk at a Big Data Workshop in Istanbul.
Collaborator Contribution Dr Wallin is an expert in Bayesian analysis. Due to the complexity of the methodology being developed as part of this grant, it was necessary to employ Bayesian estimation techniques. Dr Wallin provided advice and guidance on Bayesian analysis in the estimation of the robust joint model with time-varying outlier impacts that was introduced by this research.
Impact Robust Joint Models with Time-varying Degrees of Freedom. Robust joint models: The introduction of time-varying degrees of freedom. Robust joint modelling: A new approach to handle time-varying outlier impacts.
Start Year 2019
 
Title robjm package in R 
Description Implementation of robust mixed and joint models with time-varying degrees of freedom utilising Bayesian approaches in the estimation of parameters. In addition to these new models, the package also allows estimation of mixed and joint models under a variety of assumptions: with normality assumptions for the random effects (to be utilised when no outliers are present), with time-constant degrees of freedom equal for both random effects and random error terms and with time-constant degrees of freedom differing for both random terms. The package fits mixed and robust joint models where one of the random effects or random error terms can be normally distributed with the other random term t-distributed. The package allows visualisation of the shape of the degrees of freedom when it varies over time as well as simulation of data assuming any of the models listed above. 
Type Of Technology Software 
Year Produced 2018 
Open Source License? Yes  
Impact Several researchers have informed us they intend to use our software. 
URL https://github.com/ozgurasarstat/robjm
 
Description Big Data Workshop (Istanbul, June 2018) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact This was an invited talk at a one-day workshop in Istanbul on statistical theory and applications. The workshop consisted of a combination of lectures and invited talks. As the audience consisted of people from both industry and academia, this allowed wider dissemination of the newly developed statistical methodology and the analysis of real-world data utilising these techniques.
Year(s) Of Engagement Activity 2018
 
Description Royal Statistical Society Northern Ireland Local Group Seminar Series (April 2018) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact This was an invited seminar as part of the Royal Statistical Society Northern Ireland Local Group Seminar Series. This was intended to disseminate the initial research findings with regard to both the statistical methodology and the analysis of real-world data, inspiring people in both industry and academia to utilise the newly developed approaches. Being invited to present this talk provided a great opportunity to ensure that the theoretical developments and application findings uncovered so far reach a wide audience. In particular, our findings were presented both to those in academia who focus on statistical methodology and to those in industry for whom these methods and the medical insights gained would be of great interest. The talk was followed by a good discussion around the research findings.
Year(s) Of Engagement Activity 2018
 
Description Royal Statistical Society Northern Ireland Local Group Seminar Series (Dec 2018) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact This was an invited seminar as part of the Royal Statistical Society Northern Ireland Local Group Seminar Series which discussed the latest findings of the research and introduced the "robjm" software package in R that is being developed through this grant. People from both industry and academia were in attendance, giving a great opportunity to ensure that the statistical methodology and the analysis of real-world data were disseminated to a wide audience. The talk was followed by a good discussion around the research findings, with suggestions given for possible avenues for further enhancement of the work.
Year(s) Of Engagement Activity 2018
 
Description Royal Statistical Society Northern Ireland Local Group Seminar Series (May 2023) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This was an invited talk as part of the Royal Statistical Society Northern Ireland Local Group Seminar Series which discussed the latest findings of the research and its utilisation in analysing biomarker data from ICU patients, focusing on the association between the changes in biomarkers over time and the onset of serious adverse events, such as pneumonia. Particular attention was given to the potential of the methodology to be used as the foundation for an early warning system within the ICU department in the Royal Victoria Hospital, Belfast. People from both industry and academia were in attendance, giving a great opportunity to ensure that the statistical methodology and the analysis of real-world data were disseminated to a wide audience. The talk was followed by a good discussion around the research findings, with suggestions given for possible avenues for further enhancement of the work.
Year(s) Of Engagement Activity 2023
URL https://rss.org.uk/training-events/events/events-2023/local-groups/rssni-talk-wednesday,-may-3-rd-@-...