Statistical methods for understanding multilevel and longitudinal data with application to health studies in South Wales

Lead Research Organisation: CARDIFF UNIVERSITY
Department Name: School of Medicine

Abstract

Many epidemiological studies produce longitudinal data, where there are repeated measurements on the same individuals or groups. Unfortunately, these measurements may not be available for all individuals at all the measurement times, a problem known generically as missingness. The most common type of missingness is dropout, when individuals miss a measurement and then never reappear.

If people drop out because they are becoming increasingly unhealthy, then over time the surviving individuals become less representative, since only the healthier people remain. Though this can introduce bias into conclusions drawn from the dataset, many scientists still ignore the problem entirely because the methods available for adjusting for dropout tend to be complicated and time-consuming. In this fellowship I will develop my recently proposed technique for dealing with dropout, and apply it to a 25-year study of cardiovascular disease in Wales.

Additionally, similar methods have the potential to be applied to another structure common in epidemiology, in which people cluster within neighbourhoods. While these techniques will be applicable in a wide variety of situations, this fellowship will focus in particular on two health studies from South Wales, and will investigate the dynamics and geography of physical and mental health in this region.

Technical Summary

This work is motivated by two studies set in the county borough of Caerphilly, and my continued interest in semi- and non-parametric methods for the analysis of multilevel and longitudinal data. The aims of the fellowship are to develop and disseminate readily applicable and interpretable techniques for understanding data which may be clustered, subject to potentially informative missingness, or high-dimensional. These will be applied to the Caerphilly Prospective Study (CPS) and the Caerphilly Health and Social Needs Study (CHSNS), which seek, respectively, to investigate individual risk factors for cardiovascular disease and cognitive decline, and to provide evidence for relationships between the common mental disorders and individual- and area-level variables.

Specific aims and objectives are:

(i) Develop statistical methods for dealing with potentially informative dropout in the CPS
- to adjust for informative missingness in the CPS using the linear increments (LI) approach
- to adapt the LI approach in order to accommodate uncorrelated measurement error
- to develop the LI approach to handle ordinal and categorical responses properly

(ii) Develop statistical methods appropriate for multilevel studies such as the CHSNS
- to design non-parametric models and graphical procedures for large multilevel datasets
- to produce improved techniques for assessing the variability of variance estimates
- to improve upon existing methodology for the measurement of neighbourhoods

(iii) Use these more robust techniques developed in (i) and (ii)
- to model the relationship between individual determinants of CVD and cognitive decline
- to analyse societal hierarchical determinants of mental health and common mental disorders
- to provide evidence-based recommendations to improve health and reduce health inequalities

The development of novel statistical methodology will build upon techniques developed during my PhD studies. My PhD work shows that studying the observed increments in a longitudinal process is one way of accounting for dropout bias in certain cases, but this fellowship will extend the increments technique to other everyday problems which are not presently within its scope.
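
As a rough illustration of the increments idea, the sketch below compares a naive available-case mean with an estimate built by cumulating the mean observed increments. It is only a minimal sketch under stated assumptions: monotone dropout, no covariates, and mean increments among those still observed that are representative of the whole cohort. The function names and the toy data are hypothetical and are not taken from the fellowship or the Caerphilly studies.

```python
import numpy as np


def linear_increments_mean(y):
    """Estimate the mean trajectory by cumulating mean observed increments.

    y: array of shape (n_subjects, n_times), with NaN after dropout.
    Assumes everyone is observed at the first time point (illustrative only).
    """
    y = np.asarray(y, dtype=float)
    n_times = y.shape[1]
    est = np.empty(n_times)
    est[0] = np.nanmean(y[:, 0])
    for t in range(1, n_times):
        # An increment is NaN unless the subject is observed at both t-1 and t.
        inc = y[:, t] - y[:, t - 1]
        est[t] = est[t - 1] + np.nanmean(inc)
    return est


def available_case_mean(y):
    """Naive estimate: average whoever is still observed at each time."""
    return np.nanmean(np.asarray(y, dtype=float), axis=0)


# Toy data (hypothetical): every subject declines by 1 unit per visit,
# and the subjects with the lowest scores drop out first.
y = np.array([
    [10.0, 9.0, 8.0],
    [8.0, 7.0, 6.0],
    [4.0, 3.0, np.nan],
    [2.0, np.nan, np.nan],
])

li = linear_increments_mean(y)   # declines by 1 at each visit
ac = available_case_mean(y)      # rises, because only healthier people remain
print("linear increments:", li)
print("available cases:  ", ac)
```

In this toy example the available-case mean rises over time even though every individual is declining, whereas the cumulated increments recover the common decline. Real analyses would need covariates, measurement error and non-monotone missingness to be handled, which this sketch does not attempt.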

My current work has shown that non-parametric methods for the analysis of ordinal, multilevel data can also be developed as models for the increments of a stochastic process. Using dynamic covariates, which allow for dependence between people in the same neighbourhood, is an appealing way to overcome sensitivity to the particular family of distributions chosen in parametric random effects modelling.

Results from these improved analyses of the Caerphilly studies will then be made available to policy makers in Wales.
