Developing strategies for handling missing data in time-to-event analyses: Incorporating variable selection, variable transformation and time-varying

Lead Research Organisation: London Sch of Hygiene and Trop Medicine
Department Name: Epidemiology and Population Health


Missing data is a common problem in many areas of research, including survival studies, which are the focus of this research. It can lead to biased estimates of associations between explanatory variables and time to event, and to a loss of power to detect such associations. Many methods have been used to handle missing data, such as complete-case analysis (excluding individuals with missing data) and simple imputation approaches such as mean imputation or regression imputation. However, each of these methods relies on restrictive assumptions about the missing data or can underestimate the uncertainty in estimates. Multiple imputation, which fills in each missing value with plausible substitutes, is a more robust and efficient method of handling missing data and has become popular.

Although widely used, multiple imputation has not yet been fully developed to deal with many issues faced in practice, such as handling time-varying effects of explanatory variables, incorporating flexible transformations of explanatory variables, incorporating variable selection, allowing for time-dependent variables, and allowing for measurement error, all of which need to be considered in an analysis. The aim of this project is to develop flexible approaches that address these issues. I will also test the methods using simulation studies and apply them to real-world data sets. The final product will be an algorithm and worked example, with developments made in relevant software to incorporate these new features.

Impact: The outcome of this work will be rigorous statistical methods that enable statisticians, epidemiologists and other researchers to handle missing data in their analyses of survival studies. This should help reduce the loss of power to detect associations and lead to less biased estimates, thereby improving the way these studies are performed and the conclusions drawn from them. Multiple imputation can also be used to handle other issues, including adjustment for verification bias and observational studies that aim to investigate causal effects using potential outcomes. The work therefore has potential in other areas, and broadening the scope of problems that multiple imputation can deal with will benefit them.

Collaboration: Completing this studentship will provide training and opportunities to collaborate with other researchers. There is the opportunity to become involved with the STRATOS (STRengthening Analytical Thinking for Observational Studies) initiative, a group of international experts formed with the aim of providing accessible and accurate guidance on the design and analysis of observational studies. This project is relevant to several STRATOS topic groups, such as Missing data; Selection of variables and functional forms in multivariable analysis; and Survival analysis. Another group of interest is the MiDIA group, which is concerned with statistical analyses involving missing data and with increasing awareness of the problems that missing data cause.

Priorities of ESRC: Survival analysis is used in many fields of research, in particular in medical, sociological, economic and public health studies. Many of the data sets available for use, such as routinely collected data, are subject to missing data. The methods developed in this project are therefore expected to be of importance for several priority areas for the ESRC.

Training and Skills: Conducting a literature review will improve research strategy skills and knowledge of methods. ESRC Core skills training sessions will improve networking and presentation skills as well as the ability to express and clarify ideas. Attendance at courses on missing data, simulation studies, flexible modelling and survival analysis will improve my knowledge base. Attendance at a course of the Academy for PhD Training in Statistics will provide further training in modern statistical methods and an opportunity to network with fellow students.



Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
ES/P000592/1                                   01/10/2017  30/09/2027
1922791            Studentship   ES/P000592/1  25/09/2017  30/01/2021  Orlagh Carroll