Efficient and effective use of longitudinal data

Lead Research Organisation: CARDIFF UNIVERSITY
Department Name: School of Medicine

Abstract

Information gathered in medical studies is often underused. Sometimes, this is because of problems
with data collection, where participants decline to provide answers to certain questions, or decide to
discontinue treatment before the end of a drug trial. Other times, a lack of simple tools to extract
relevant findings can restrict a researcher's ability to delve deeply into the data.
Much existing data is longitudinal, with study subjects followed over several weeks, months or
years. Longitudinal data is particularly valuable because it is possible to observe changes in health
conditions resulting from life events or clinical interventions. One convenient approach to
understanding longitudinal data is to focus on these changes, using statistical models to separate
out the predictable effects of past lifestyle and treatment decisions from the unpredictable aspects
of human biology and behaviour.
This fellowship will provide statistical tools to make fuller use of longitudinal studies. These tools are
designed to deal simply with the uncertainty surrounding incomplete patient records, allowing
meaningful conclusions to be drawn about disease prognosis and progression. Re-examining existing
data using these new tools has considerable potential to improve human health through better
scientific understanding.

Technical Summary

In order to reason causally from medical studies, a minimal requirement is that the temporal
ordering of events is explicitly recognised. As this principle became established there was a
corresponding increase in the number of research projects recording many variables over several
waves of observation. Nevertheless, some statistical analyses of this type of data still fail to
recognise the critical role of time. Consequently, existing data sources contain a wealth of
information and a rich temporal structure that has not been fully exploited.
Introduced in my PhD thesis, the linear increments model focusses on observed changes in
measured quantities, and hence offers a natural framework for acknowledging study chronology.
This fellowship will develop and disseminate the linear increments approach, arguing for its routine
use with longitudinal designs. An important feature of the incremental approach is that missing
responses (due to dropouts, for example) pose no particular analytical problem, other than the
unavoidable reduction in sample size.
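
As a rough illustration of the incremental idea, the sketch below assumes the simplest possible setting: a single outcome, no covariates, monotone dropout, and missing responses recorded as None. The function names and the toy records are hypothetical and are not taken from the fellowship's own software. The mean at each wave is estimated as the baseline mean plus the accumulated average change among subjects observed at both adjacent waves.

```python
from typing import List, Optional

Record = List[Optional[float]]  # one subject's responses, None once missing


def linear_increments_mean(records: List[Record]) -> List[float]:
    """Accumulate mean observed increments, wave by wave (hypothetical helper)."""
    n_waves = len(records[0])
    baseline = [r[0] for r in records if r[0] is not None]
    estimates = [sum(baseline) / len(baseline)]
    for t in range(1, n_waves):
        # Only subjects observed at both t-1 and t contribute an increment.
        increments = [
            r[t] - r[t - 1]
            for r in records
            if r[t] is not None and r[t - 1] is not None
        ]
        if not increments:
            raise ValueError(f"no observed increments at wave {t}")
        estimates.append(estimates[-1] + sum(increments) / len(increments))
    return estimates


def available_case_mean(records: List[Record]) -> List[float]:
    """Naive comparison: average whoever is still observed at each wave."""
    means = []
    for t in range(len(records[0])):
        observed = [r[t] for r in records if r[t] is not None]
        means.append(sum(observed) / len(observed))
    return means


# Toy data (invented): every subject rises by 1 per wave; two drop out early.
records = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, None],
    [3.0, None, None],
]
li_estimates = linear_increments_mean(records)
naive_estimates = available_case_mean(records)
```

For these toy records the incremental estimates are 2.0, 3.0 and 4.0, whereas the available-case means are 2.0, 2.5 and 3.0, because the subjects who drop out happen to start with higher values.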
Several facets of the work will serve to make this statistical technology more relevant to medical
practitioners. One proposed development will more neatly distinguish aspects of the model that are
of principal interest from those of secondary importance, just as in a proportional hazards model the
effect of treatment on survival is emphasised over the precise baseline pattern of events. Another
extension will allow inference to centre on surviving patients, rather than on a manifestly
hypothetical immortal cohort. Often, multiple outcomes may be clinically pertinent, and so a third
aspect of the fellowship will be to enable exploration of the dynamics of related variables.
An important goal of the fellowship will be to make the case that the linear increments model is the
natural analogue of the popular Kaplan-Meier approach to event times. To this end, a unifying
mathematical theory of dynamic observation of stochastic processes will be sought, incorporating at
one extreme continuous-time survival, and discrete-time panel data at the other. Equally crucial will
be the successful application of the methodology to substantive questions arising in randomised
trials, observational studies and epidemiological research.
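
For readers less familiar with the Kaplan-Meier side of this analogy, the following is a minimal sketch of the standard product-limit calculation, written with hypothetical names and invented follow-up times. At each event time the survival estimate is multiplied by the proportion of at-risk subjects who do not experience the event, so censored subjects contribute while they remain under observation, much as dropouts contribute their observed changes in the incremental approach.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate at each distinct event time.

    events[i] is True if the event was observed at times[i], False if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censorings both leave the risk set after time t.
        n_at_risk -= n_leaving
    return curve


# Invented example: four subjects, the second is censored at time 2.
km_curve = kaplan_meier([1.0, 2.0, 3.0, 4.0], [True, False, True, True])
```

Here the estimate drops to 0.75 at time 1, to 0.375 at time 3 (only two subjects remain at risk after the censoring), and to 0 at time 4.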

Publications
