Development and enhancement of Longitudinal Education Outcomes (LEO) data

Lead Research Organisation: University College London
Department Name: Learning and Leadership


Understanding how much individuals and society benefit from different education and training courses is vital for governments weighing up investments in education and skills. Estimating these benefits requires data with rich information on education, training and earnings. A sample large enough to show whether the benefits vary across different groups (e.g. by socio-economic background) or different areas of the country is also crucial for informing important policy decisions, such as the extent to which investment in education and skills for disadvantaged individuals or those living in 'left-behind' areas will help 'level up' the country.

We have access to such data in England, known as the Longitudinal Education Outcomes (LEO) data, which link together education records, benefit records and tax records. These data have provided crucial new insight into how much individuals and areas benefit from higher education, for example. But, to date, access to these data has been restricted to a relatively small number of individuals and organisations. The data are becoming more widely available, but the number and complexity of the datasets included as part of LEO present a substantial barrier to new users: they have to invest a lot of time in understanding the data before they are able to use them effectively, and some important research questions may go unanswered as a result.

Moreover, the data could be even more useful if we were able to incorporate additional information. For example, if we could include information on the places where individuals work - and who works with whom - then we could understand how much investment in education and training benefits people's colleagues, and the businesses in which they work. Similarly, if we were able to link in information about which individuals applied to university, and where, and compare this to the offers they received and where they went, we could understand more about the role of individual preferences and university decisions in generating the strong links evident between socio-economic background, education choices and later outcomes.

Our project will fill both of these gaps. Specifically, it will:

1. Enhance existing LEO data by:
a. Creating a simplified and consistent set of variables summarising important pieces of information from the data, such as measures of educational attainment, employment and earnings, that researchers can use to help get them started with their analysis.
b. Linking in new contextual data, such as about the areas in which individuals live.
c. Sharing documentation, code and metadata for these newly created variables (in a. and b.).
d. Creating and running an online forum through which current and potential users can find information about the data and future developments, and seek help from other users.
e. Providing introductory and advanced training events to build capacity in use of the data.

2. Link in new data, including on the places where individuals work and, for those who applied, information on their university applications and offers, and:
a. Incorporate these data into each of the elements outlined under 1. for existing LEO data, i.e. produce documentation and consistent variables for these new data; merge in additional relevant contextual data (e.g. on the 'quality' of the higher education institutions applied to and attended); and build awareness and capacity in use of these new data by incorporating information into the online forum and providing bespoke training events and resources.
b. Undertake new research to demonstrate the value of these new data in addressing important policy-relevant questions, such as on the link between education and business productivity, and whether policies which give lower university entry offers to students from more disadvantaged backgrounds are effective in improving outcomes for these individuals.

