Making inference to the population from nationally representative longitudinal surveys with missing data due to nonresponse to biological data collect

Lead Research Organisation: University of Manchester
Department Name: Social Sciences


Large-scale surveys provide extremely important sources of data for research into a wide variety of key social issues. This is especially the case where the people selected to participate are part of a sample that is representative of the population and also where the survey is repeated at regular intervals for the same sample over time. Some such surveys also include a health data component where body measurements and biological samples are collected in addition to the survey questionnaire. This has created new and exciting possibilities for research into the interactions between social phenomena (such as values and behaviours), economic indicators (such as income) and markers of physical health (such as blood pressure and cholesterol levels).
A research area currently of particular interest involves measuring the physical markers of stress in order to investigate their potential social correlates and causes. Related to this is the concept of allostatic load, which represents the physiological consequences of repeated or chronic stress and has been linked to socio-economic status (SES) and poor health outcomes. Biological survey data have the potential to give important insights into the causes and effects of allostatic load levels when analysed over time alongside associated social and economic data.
However, this exciting research potential also brings new methodological challenges. Whilst survey samples can be selected to represent the general population, not everyone agrees or is able to participate. The extra steps required to collect biological data, in particular blood samples, can add to this inability and/or reluctance to respond positively to the survey request. This may lead to biases in the results of statistical analyses if these are generalised to the population as a whole without adequate compensation for the data that are missing. An example would be if those in the selected sample with certain health conditions were less likely to provide blood samples. Results obtained from using only the collected blood data to analyse the associations between SES and markers of chronic stress would not represent the whole sample, leading to potentially incorrect conclusions about the population.
The aim of this project is to address these challenges by developing and clarifying appropriate methods for dealing with missing biological data in surveys so that inferences can be made about the general population, particularly when estimating the relationship between allostatic load and SES. The project considers both the actions that can be taken before and during the fieldwork carried out to collect the data and the analysis that takes place once the data are made available to researchers. The work focuses not only on evaluating and comparing existing missing data techniques but also on incorporating new forms of auxiliary data that can be collected during the survey process and investigating how these can be used to improve the robustness of current methods. In particular, this includes using additional information about the trained nurses who carry out data collection in the field, such as their sample allocation and experience. The auxiliary data also include records taken by nurses during the process of contacting and gaining cooperation from sample members, such as the number and length of call attempts to a household.
The data for the project come from two large-scale surveys in the UK that include biological data collection: the UK Household Longitudinal Study (UKHLS) and the English Longitudinal Study of Ageing (ELSA). UKHLS is an annual general population survey in the UK with a sample of 40,000 households at wave 1. Waves 2 and 3 were conducted during 2009-13 and included a nurse visit for subsamples of eligible respondents. ELSA is a survey of the older population in England and is carried out every two years. The sample at wave 1 included 11,500 respondents aged at least 50 in 2002. Nurse visits have been conducted every two years from wave 2 onwards.



Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
ES/P000347/1                                   01/10/2017  30/09/2024
1911760            Studentship   ES/P000347/1  01/10/2017  19/05/2023  Fiona Pashazadeh