Estimating features of trajectories in repeated measures data

Lead Research Organisation: University of Bristol
Department Name: Social Medicine

Abstract

Investigations of a disease or condition commonly involve repeated measurement of important biological values over time on a group or cohort of individuals with that condition. For example, after men with prostate cancer receive treatment for their disease, they are monitored to check that their cancer has not returned. Prostate specific antigen (PSA) is repeatedly measured through blood tests every few months. PSA is produced in the prostate but, vitally, more PSA is produced by tumour cells than healthy prostate cells so that a high value of PSA is a worrying sign for cancer recurrence. The idea behind monitoring these people is that an outcome, such as having a recurrence of cancer or dying from a disease, may occur at some point in the future. Knowing the outcome, we can investigate how the biological measurements over time differ for those who have one outcome (e.g. die from prostate cancer) compared to those who experience another (e.g. survive).

In many examples, these repeated measurements can display complex changes when we look at a plot of the measurements over time. In the prostate cancer example, PSA is well known to have high day-to-day variation such that repeated measures every few months will produce plots which show highly nonlinear changes in PSA over time. One problem is how best to summarise such repeated measurements. Some people look only at the average value of all of the measurements and ignore the fact that they were measured over time. Some believe that the most recent value is the only important one and that this should be used as a summary. Neither of these methods gives a sense of the behaviour of the measurements for an individual over time. I will be developing new methods to summarise the behaviour in terms of certain features of the trend or trajectory. For example, I will look at the speed at which the measurements are changing, and the age at which the fastest change in measurement is found. Looking at better summaries of behaviour may allow us to better predict outcomes in the individuals, such as prostate cancer recurrence in the example of repeated PSA measurement.

The research will be applied in four examples, namely HIV infection, prostate cancer, blood pressure changes during pregnancy and blood pressure changes in adolescents. Data on these four examples have already been collected as part of large group or cohort studies. I will be analysing the data with statistical software and developing methods which can be tested in that software. If these features do tell us about future outcomes, it will be possible to raise an alert and intervene in cases where a dangerous feature is found during routine clinical monitoring.

Technical Summary

Aim
I aim to develop methods to obtain features of trajectories from repeated measures data and relate these features to outcomes.

Objectives
Derivative estimates will be developed for several methods for analysing trajectories of repeated, non-linear data, and used to estimate features of a trajectory. I will develop inverse prediction intervals to incorporate uncertainty around the timing of a feature. A 2-stage model for the association between a feature and outcome (i.e. estimate a feature, then in a separate model relate it to the outcome) will be implemented. Finally, a method for jointly estimating a feature and outcome will be developed.
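
As a rough illustration of what estimating a feature from a derivative can look like, the sketch below fits a smooth curve to one simulated individual's measurements, differentiates the fitted curve, and reports the time and size of the fastest increase. It is a minimal sketch only: an ordinary polynomial fit stands in for the spline and mixed-model approaches named in this proposal, and the function name `peak_velocity`, the polynomial degree and the simulated data are all assumptions made for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial


def peak_velocity(times, values, degree=5, grid_size=501):
    """Return (time of fastest increase, velocity at that time).

    Assumption: a single polynomial is an adequate smoother for one
    individual's trajectory; this is a stand-in, not a P-spline model.
    """
    curve = Polynomial.fit(times, values, deg=degree)  # smooth trajectory
    velocity = curve.deriv()                           # first derivative
    grid = np.linspace(np.min(times), np.max(times), grid_size)
    v = velocity(grid)
    i = int(np.argmax(v))                              # largest rate of change
    return float(grid[i]), float(v[i])


# Simulated individual (hypothetical data): an S-shaped trajectory whose
# true fastest change occurs at time 12, observed with measurement noise.
rng = np.random.default_rng(0)
t = np.linspace(8, 16, 41)
y = 150 + 30 / (1 + np.exp(-(t - 12))) + rng.normal(0, 0.2, t.size)

t_peak, v_peak = peak_velocity(t, y)
print(f"fastest change near t = {t_peak:.2f}, velocity = {v_peak:.2f}")
```

The grid search over the derivative keeps the example short; a root-finder applied to the second derivative would be an alternative way to locate the same feature.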

Methodology
The methods which I will consider for handling repeated measures data are P-spline and semiparametric mixed models, functional principal components analysis through conditional expectation (PACE), and Bayesian direct gradient estimation.

Scientific/Medical Opportunities
Derivative estimation, inverse prediction and bivariate modelling will be key developments in longitudinal data analysis. In a range of disciplines it is often the case that the rate of change of observed data is of primary interest. Predicting the value of an explanatory variable at a given value of response is a very general problem in statistics. Intervals around such an estimate will be a valuable tool in repeated measures analysis. Simultaneous estimation of a feature and outcome is an approach linking repeated measurements with an outcome variable. Correct specification of the correlation between the feature and outcome is vital.
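
The general inverse prediction problem mentioned above can be sketched in its simplest form: fit a straight line of response on explanatory variable, solve for the explanatory value at which the fitted line reaches a target response, and use a bootstrap to put an interval around that value. This is an illustrative sketch under stated assumptions, not the method proposed in this fellowship: the function name `inverse_predict`, the linear model, the case-resampling bootstrap and the simulated data are all chosen for the example, and the interval reflects only uncertainty in the fitted line, not the error of a new observation.

```python
import numpy as np


def inverse_predict(x, y, y0, n_boot=2000, level=0.95, seed=0):
    """Estimate the x at which a fitted straight line reaches y0.

    Returns (estimate, lower, upper), where the limits come from a
    percentile bootstrap that resamples (x, y) pairs with replacement.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)      # highest power first
    x0_hat = (y0 - intercept) / slope           # invert the fitted line

    rng = np.random.default_rng(seed)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, x.size, x.size)   # resample cases
        s, i = np.polyfit(x[idx], y[idx], 1)
        boot[b] = (y0 - i) / s
    alpha = 1.0 - level
    lower, upper = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return float(x0_hat), float(lower), float(upper)


# Hypothetical data: the true line is y = 2 + 0.5 x, so y0 = 4.5 is
# reached at x = 5.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 30)
y = 2 + 0.5 * x + rng.normal(0, 0.2, x.size)

x0_hat, lower, upper = inverse_predict(x, y, y0=4.5)
print(f"x at y0: {x0_hat:.2f} (interval {lower:.2f} to {upper:.2f})")
```

The interval here lies on the x-axis, which is the sense in which inverse prediction differs from the usual interval around a predicted response.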

Four applications will be used to demonstrate the methods and direct conclusions from these will be valuable to the relevant clinical setting. Where novel insights are gained by using features of trajectories this will directly benefit that medical area. Further, these methods can be used in any application of repeated measures and I will promote this with dissemination throughout the fellowship.

Planned Impact

The three main developments of this fellowship, namely derivative estimation for repeated measures data, inverse prediction and bivariate estimation will each prove useful in the biostatistics community. Derivative estimation is becoming a larger element of analyses as our need to understand change grows with the collection of more and better data. The methods proposed here are at the forefront of this understanding. This research will lead to methods which can be applied in new areas without the need for new data to be collected. By using features of repeated measures as risk factors, I hope to gain a novel insight into the applications described in my Case for Support. This will have a multifaceted impact.

First, in the application itself, if a feature of, say, PSA after treatment is associated with recurrence of prostate cancer, then this may be used prospectively in the monitoring of patients after surgery or radiotherapy. This will potentially lead to earlier intervention by way of further treatment, which will lead to cost savings and improved quality of life. The direct beneficiaries of this example would be men with recurrent prostate cancer and their families. Further, clinicians in charge of monitoring men after treatment will have an improved knowledge base on which to recommend further action. During the course of my current post I developed an Excel system which is designed for use by urologists in monitoring men on active surveillance for localised prostate cancer. An easy-to-use online tool or software package of this kind is certainly one which can be implemented given successful results from this fellowship. This could potentially impact on NHS policy where a feature has a strong association as a risk factor and is then built into such a system, which is tested and rolled out to GPs or consultants. These impacts would be long term and would be achieved only after sufficient testing, which is not included as part of the proposed fellowship.

Second, in the wider UK academic research community the impact of a successful application of these methods would be the increased awareness of analysing repeated measures in this way. The methods are not restricted to these four applications and through proper dissemination in journals, conferences and short courses the use of features could be employed in biomedical applications where repeated measures are regularly taken and in emerging fields such as epigenetics. Further, outside of public health, the methods could be used in finance, sports science and other environments where data is routinely and repeatedly collected on the same unit or individual. To enhance this transfer of knowledge, I will write tutorial style papers which will, along with attached software, maximise the impact of this fellowship.

Finally, through the University of Bristol Centre for Public Engagement (CPE), a novel insight into prostate cancer, for instance, can be relayed to the general public. At the moment, prostate cancer is receiving a lot of publicity and new advancements in the area are constantly being welcomed. The CPE will allow any potential applied impact of the research to be demonstrated to those who will benefit the most.
 
Description Developing inverse prediction for functional data analysis 
Organisation Columbia University Medical Center
Country United States 
Sector Academic/University 
PI Contribution Together with Dr. Sara Lopez-Pintado and Prof. Ian McKeague, I have developed an approach to prediction intervals for the timing of a feature of functional data. In short, biomedical data which are repeatedly measured on the same individuals often take nonlinear trajectories rather than following simple linear behaviour. My fellowship aims to summarise these trajectories in terms of key features and the timing of these features (e.g. the timing of a peak in the biomedical data over time). We have developed a prediction interval around the timing of these features, i.e. an interval on the x-axis.
Collaborator Contribution Dr. Lopez-Pintado and Prof. McKeague were responsible for overseeing my development of the method. R was then employed to test the approach empirically on both simulated and real data.
Impact We have submitted an abstract for oral presentation at the International Biometrics Conference in July 2016. We are currently drafting a manuscript for submission to the journal Biometrics, describing and testing the developed method.
Start Year 2015
 
Description Conference on Applied Statistics in Ireland, Cork 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I was accepted for an oral presentation at CASI, the largest statistics conference in Ireland, which took place in Cork in May 2015. Roughly 150 people were in attendance and there was a healthy question and answer session afterwards. This talk led to collaboration with Dr. Caroline Brophy on data which were relevant to the MRC fellowship.
Year(s) Of Engagement Activity 2015
 
Description IBC 2016 - Biostatistics conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk given at the International Biometrics Conference in Victoria, Canada. Excellent discussion led to ideas for future collaboration and application.
Year(s) Of Engagement Activity 2016
 
Description International Society for Clinical Biostatistics Conference, Utrecht, Netherlands 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact I was accepted for an oral presentation at the International Society for Clinical Biostatistics Conference in Utrecht (Netherlands) in August. Roughly 100 people were in attendance; these were statisticians from international universities and industry. The talk was followed by excellent questions and discussion, which has led to new working links with Prof. Paul Eilers and Prof. Jutta Gampe.
Year(s) Of Engagement Activity 2015
 
Description Invited seminar, Columbia University 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I was invited to present my fellowship work to the functional data analysis workshop at the Department of Biostatistics, Columbia University in October 2015. This was attended by 15 academics, and was a useful way to begin my 6-month collaborative visit at Columbia. This talk led to my introduction to (among others) Dr. Jeff Goldsmith, with whom I have since begun collaborating on developing methods in functional principal components analysis.
Year(s) Of Engagement Activity 2015
 
Description Invited seminar, Harvard University 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I was invited by Dr. Erin Dunn to present my ongoing fellowship work to her group at Harvard Public Health. This talk was attended by 20 academics and was related to my applied rather than methodological aims. Excellent discussion has led to a new application of the methods, namely in brain imaging data collected over time.
Year(s) Of Engagement Activity 2015
 
Description Invited seminar, London School of Hygiene and Tropical Medicine 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact I was invited by Dr. Ruth Keogh to give the weekly statistics seminar to the LSHTM group in February 2015. The talk was attended by roughly 40 academics and was made available on the LSHTM website.
Year(s) Of Engagement Activity 2015
 
Description Invited seminar, National University of Ireland, Galway 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact I was invited to present my fellowship work to the Statistics group seminar at NUI, Galway in April 2015. There were approximately 20 academics in attendance and discussion afterwards was helpful in planning future research within the fellowship.
Year(s) Of Engagement Activity 2015