Development and enhancement of Longitudinal Education Outcomes (LEO) data

Lead Research Organisation: UNIVERSITY COLLEGE LONDON
Department Name: Learning and Leadership

Abstract

Understanding how much individuals and society benefit from different education and training courses is vital for governments weighing up investments in education and skills. Access to data with rich information on education, training and earnings is crucial to estimating these benefits, and having a large enough sample to consider whether the benefits vary across different groups (e.g. by socio-economic background) or different areas of the country is crucial in informing important policy decisions, such as the extent to which investment in education and skills for disadvantaged individuals or those living in 'left-behind' areas will help 'level up' the country.

We have access to such data in England, known as the Longitudinal Education Outcomes (LEO) data, which links together education records, benefit records and tax records. These data have provided crucial new insight into how much individuals and areas benefit from higher education, for example. But, to date, access to these data has been restricted to a relatively small number of individuals and organisations. The data are becoming more widely available, but the number and complexity of the datasets included as part of LEO presents a substantial barrier to new users, as it means they have to invest a lot of time in understanding the data before they are able to use it effectively, and may mean some important research questions go unanswered as a result.

Moreover, the data could be even more useful if we were able to incorporate additional information. For example, if we could include information on the places where individuals work - and who works with whom - then we could understand how much investment in education and training benefits people's colleagues, and the businesses in which they work. Similarly, if we were able to link in information about which individuals applied to university, and where, and compare this to the offers they received and where they went, we could understand more about the role of individual preferences and university decisions in generating the strong links evident between socio-economic background, education choices and later outcomes.

Our project will fill both of these gaps. Specifically, it will:

1. Enhance existing LEO data by:
a. Creating a simplified and consistent set of variables summarising important pieces of information from the data, such as measures of educational attainment, employment and earnings, that researchers can use to help get them started with their analysis.
b. Linking in new contextual data, such as about the areas in which individuals live.
c. Sharing documentation, code and metadata for these newly created variables (in a. and b.).
d. Creating and running an online forum through which current and potential users can find information about the data and future developments, and seek help from other users.
e. Providing introductory and advanced training events to build capacity in use of the data.

2. Link in new data, including on the places where individuals work and, for those who applied, information on their university applications and offers, and:
a. Incorporate this data into each of the elements outlined under 1. for existing LEO data, i.e. produce documentation and consistent variables for these new data; merge in additional relevant contextual data (e.g. on the 'quality' of the higher education institutions applied for and attended); and build awareness and capacity in use of this new data by incorporating information into the online forum and providing bespoke training events and resources.
b. Undertake new research to demonstrate the value of this new data in addressing important policy-relevant questions, such as on the link between education and business productivity, and whether policies which give lower university entry offers to students from more disadvantaged backgrounds are effective in improving outcomes for these individuals.
 
Description Our grant has supported, and continues to support, the development of a range of training and capacity building resources to increase knowledge and understanding of education administrative data, with a particular focus on the Longitudinal Education Outcomes dataset. We have already delivered training to over 200 members of the research community, with around 35-40% coming from non-academic audiences - primarily central government departments, including the Department for Education, but also policymakers from Northern Ireland, local government, and a range of third sector organisations. Running our courses online means that we can also more easily reach stakeholders from around the country, including Manchester, Edinburgh, Bristol and Swansea, and we have provided follow-up support to a number of participants, enabling them to move forward with data applications. The partnership we have developed with the LEO programme team at the Department for Education - and in particular the secondment arrangements that we worked to put in place - has enabled us to build new partnerships and collaborations, including with other parts of the Department, increasing the impact of our work on this grant. For example, we have been an integral part of the team working to understand the key requirements of a new administrative dataset to measure the Opportunity Mission's key metric of intergenerational income mobility, and were instrumental in identifying the chosen solution, which is now being implemented. Our collaborations with the former Unit for Future Skills (now Skills England) and HE access team are further evidence of the opportunities and potential for impact generated by being embedded within the Department, which this grant facilitated. These activities would likely not have occurred in the absence of this grant.
First Year Of Impact 2023
Sector Education
Impact Types Societal

Policy & public services

 
Description Development and delivery of introduction to GRADE course
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact We improved the knowledge of participants on our course, enabling them to apply for and use GRADE data more effectively, with the potential to increase the quantity and quality of research in the public benefit.
URL https://www.eventbrite.com/cc/cepeo-adruk-administrative-data-training-courses-3888843
 
Description Development and delivery of introduction to LEO course
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact We improved the knowledge of participants on our course, enabling them to apply for and use LEO data more effectively, with the potential to increase the quantity and quality of research in the public benefit.
URL https://www.eventbrite.com/cc/cepeo-adruk-administrative-data-training-courses-3888843
 
Description Development and delivery of introduction to NPD and its linked data course
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact We improved the knowledge of participants on our course, enabling them to apply for and use NPD, LEO, GRADE and GUiE data more effectively, with the potential to increase the quantity and quality of research in the public benefit.
URL https://www.ucl.ac.uk/ioe/departments-and-centres/centres/centre-education-policy-and-equalising-opp...
 
Description Development and delivery of more detailed introduction to NPD course
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact We improved the knowledge of participants on our course, enabling them to apply for and use NPD data more effectively, with the potential to increase the quantity and quality of research in the public benefit.
URL https://www.eventbrite.com/cc/cepeo-adruk-administrative-data-training-courses-3888843
 
Description Invited to sit on ADRUK youth transitions community catalyst steering group
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Invited to sit on Data Access and Engagement Programme advisory group
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Invited to sit on LEO and PPMD Integration and Development Project Board
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Investing In Digital Skills For Research: Education Administrative Data And BeYond (IDS-READY)
Amount £407,876 (GBP)
Funding ID UKRI306 
Organisation United Kingdom Research and Innovation 
Sector Public
Country United Kingdom
Start 09/2024 
End 03/2027
 
Description Collaboration with the Department for Education's HE access team 
Organisation Department for Education
Country United Kingdom 
Sector Public 
PI Contribution We are updating analysis of the role of prior attainment in driving inequalities in access to higher education. Previous work undertaken by our team is still regarded as the 'go-to' source of information on this for the Department, despite being about 10 years old. We are adding to the team's evidence base for policy development by updating these findings and also extending them to consider mature learners.
Collaborator Contribution Staff at DfE facilitated access to more recent years of HESA data than are available via the Secure Research Service, and are providing the conduit through which these results can be shared with ministers and policy officials in the Department.
Impact The work is still ongoing. The collaboration is multi-disciplinary, including both analysts (social scientists) and policy officials (disciplines unknown).
Start Year 2024
 
Description Collaboration with the former Unit for Future Skills 
Organisation Department for Education
Country United Kingdom 
Sector Public 
PI Contribution We are providing quality assurance for a new linked administrative-survey dataset - the Longitudinal Education Outcomes dataset linked to the Annual Survey of Hours and Earnings - as well as contributing new research findings on a topic of interest to the Unit for Future Skills (now Skills England).
Collaborator Contribution Staff at the Unit for Future Skills facilitated access to the LEO-ASHE data and shared their knowledge and understanding of the data, and code, to support us with our work.
Impact The work is still ongoing. The collaboration is primarily with economists.
Start Year 2024
 
Description Partnership with the Department for Education LEO Programme team 
Organisation Department for Education
Country United Kingdom 
Sector Public 
PI Contribution All members of the UCL team funded by this grant are on part-time secondment to the LEO Programme team at the Department for Education (DfE), who are responsible for the development and sharing of LEO data. We are fully embedded within the LEO Programme team, with our work on this grant supporting them to deliver their objectives to enhance the usability and use of LEO data amongst the external research community. With the arrival of the new government, this team's remit expanded to consider how to develop data to capture the Opportunity Mission's key metric for success, which our team are also supporting them to deliver.
Collaborator Contribution The LEO Programme team at DfE are responsible for developing and enhancing the LEO data and supporting resources to serve the needs of researchers both inside and outside government. They are the gatekeepers to the LEO data and provide the conduit through which we can access data to derive new variables and create shareable code, improve documentation, and liaise with contacts elsewhere within DfE (e.g. to determine the best approach to creating and sharing synthetic data) and in other government departments (e.g. ONS, to explore how we can share code with researchers inside the SRS).
Impact So far the partnership has resulted in co-delivery of various training and capacity building events detailed elsewhere, including the ADRUK public engagement discussion around LEO in September 2022; the grant launch event in October 2022; the ADRUK pre-conference workshop in November 2023; the LEO training courses in March 2024 and January 2025; and the updated LEO gov.uk webpages and user guide. Ongoing work includes the creation and sharing of derived variables, the creation and sharing of a low fidelity synthetic version of the LEO data, and the development of a new linked data resource which will bring together family income in childhood and adult earnings with a view to being able to estimate intergenerational income mobility. The team are also a partner on the further funding we have secured to extend our training and capacity building activities (detailed elsewhere). This collaboration is with a non-academic partner whose expertise centres around project management and delivery, so is multi-disciplinary.
Start Year 2022
 
Description ADRUK Cambridge workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact I gave a talk as part of an ADRUK Workshop on Administrative Data for Public Policy Research organised by the University of Cambridge. The purpose was to raise awareness of different types of education administrative data and how they could be used to address policy-relevant research questions. Audiences were engaged and asked a number of questions about the data.
Year(s) Of Engagement Activity 2023
URL https://www.educ.cam.ac.uk/events/workshops/adruk/
 
Description ADRUK blog on value of LEO data 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Authored a blog, published on the ADRUK website, highlighting the innovative nature of the LEO data with a view to increasing its use for research in the public benefit.
Year(s) Of Engagement Activity 2024
URL https://www.adruk.org/news-publications/news-blogs/how-the-longitudinal-education-outcomes-data-is-b...
 
Description ADRUK pre-conference workshop 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact We successfully applied to host a pre-conference workshop at the ADRUK conference in November 2023. This also acted as a second LEO user group meeting. The purpose was to bring together the LEO user community again, share developments and receive feedback on future plans. The main outcome was to raise awareness of the new iteration of LEO data that had recently been made available to external researchers.
Year(s) Of Engagement Activity 2023
URL https://virtual.oxfordabstracts.com/#/event/4218/program?session=79378&s=269
 
Description ADRUK public engagement discussion on LEO 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact ADRUK convened a group of third sector organisations whose work could usefully be informed by LEO data/research to understand their perceptions of its value, and any risks they identified in it being used for research purposes. A report summarising the discussion was subsequently published (URL below).
Year(s) Of Engagement Activity 2022
URL https://www.adruk.org/fileadmin/uploads/adruk/Documents/PE_reports_and_documents/LEO_report_key_mess...
 
Description ADRUK training and capacity building workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Participated in a workshop organised by ADRUK with a view to sharing information about training and capacity building resources and discussing ways in which to enhance the coverage and reach of these activities in future.
Year(s) Of Engagement Activity 2024
 
Description Blog to accompany LEO I2 release 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact I wrote a blog for the ADRUK website to accompany the launch of the second iteration of LEO data. The aim was to promote the data, particularly the new datasets that had been linked in, drawing attention to its potential to address policy-relevant research questions for the public good.
Year(s) Of Engagement Activity 2023
URL https://www.adruk.org/news-publications/news-blogs/new-longitudinal-education-outcomes-data-made-ava...
 
Description DfE HE team workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Participated in a workshop hosted by the Department for Education's higher education team to share exemplar research being undertaken as part of our grant which is relevant to their portfolio, including on university admissions and the 'spillover' effects of tertiary education.
Year(s) Of Engagement Activity 2025
 
Description Dialogue with FFT re. synthetic data 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact We met with representatives from FFT Education Datalab several times to discuss plans for the creation of NPD and LEO synthetic data. This ongoing dialogue resulted in a partnership on a new funding application (detailed in the relevant section) which we successfully submitted to an invited call to UKRI's Digital Research Infrastructure programme to co-create further training and capacity building activities, including new high fidelity synthetic data subsets.
Year(s) Of Engagement Activity 2024,2025
 
Description Dialogue with UCAS re. richer offers data 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact I worked with colleagues in the Department for Education to put forward a business case to the Universities and Colleges Admissions Service to extend the data currently available for linkage in LEO. This dialogue has continued intermittently, but has not yet resulted in any new data being shared. We are continuing to pursue these discussions.
Year(s) Of Engagement Activity 2023
 
Description Dialogue with Wage and Employment Dynamics team re. synthetic data 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The purpose of the dialogue was to share information about the creation and sharing of low fidelity synthetic data, to identify best practice. The discussion was used to inform the creation of a low fidelity synthetic version of the LEO data.
Year(s) Of Engagement Activity 2024
 
Description Expert coding group 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact We convened a group of existing LEO users to understand the extent of existing resources (such as code, derived variables) that could potentially be shared with the external research community, and to identify gaps that our grant could most usefully fill. This discussion resulted in a number of individuals sharing code to add to a code repository, and also helped to shape the focus of our efforts to create new exemplar code.
Year(s) Of Engagement Activity 2023
 
Description First LEO user group meeting/grant launch event 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact We held an event to launch the grant, which also doubled as an introductory LEO user group meeting. The purpose was to bring together individuals from a range of organisations interested in LEO for the purposes of undertaking, commissioning or using research, to offer greater insight into the LEO data, and to share information regarding planned future developments and obtain feedback on these plans, including around prospective data linkages and training and capacity building activities.
Year(s) Of Engagement Activity 2022
 
Description NCRM DTRN webinar 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Presented at a Data Resources Training Network webinar - Exploring Educational Outcomes through National Datasets - to improve understanding of the LEO data amongst the research community, particularly postgraduate students, with a view to increasing use of the data for research in the public benefit.
Year(s) Of Engagement Activity 2024
URL https://www.ncrm.ac.uk/resources/video/?id=4977
 
Description Participation in LEO cross-government steering group meetings 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact We participated in LEO cross-government steering group meetings in May 2024 and November 2024. The purpose of the group is to discuss the development of LEO data in England (and other UK nations) with a range of interested parties, including other government departments/bodies who contribute data to LEO (e.g. DWP, HMRC, Jisc). We shared information regarding our plans to develop a low fidelity synthetic version of the LEO data and sought permission from all data owners to undertake this endeavour, which we secured. This enabled us to move ahead with plans to create and share a low fidelity synthetic version of the LEO data.
Year(s) Of Engagement Activity 2024
 
Description Presentation at Higher Education Access and Funding conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact We presented findings from an exemplar research project using the LEO data. The purpose was to inform public debate and policy decision-making in relation to higher education access and funding. The event highlighted commonalities and differences in the challenges faced by very different HE funding systems (e.g. England vs. US), and sparked ideas regarding how funding and access challenges could best be addressed.
Year(s) Of Engagement Activity 2024
URL https://global.georgetown.edu/events/higher-education-access-and-funding-challenges-and-policy-optio...
 
Description Project advisory board 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Third sector organisations
Results and Impact We convened an advisory board to inform the work undertaken on the grant. The group consisted of representatives from third sector organisations (e.g. Edge Foundation, Education Policy Institute, FFT Education Datalab, National Foundation for Educational Research, Resolution Foundation, Sutton Trust), government departments (e.g. HM Treasury, the Office for National Statistics) and non-governmental organisations (e.g. Office for Students). These individuals represented organisations with expertise and/or interest in the LEO data, who could provide insight into the needs of data users, to ensure the grant delivered outputs of greatest value to the external research community. The group has met four times to date (in October 2022, May 2023, January 2024 and January 2025) and has informed the types of training and capacity building activities we will deliver, as well as future data linkages.
Year(s) Of Engagement Activity 2022,2023,2024,2025
 
Description RES DTP data workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Presented at a Royal Economic Society Doctoral Training Partnership event - Databases for Research Economists - to improve understanding of the LEO data amongst the research community, particularly postgraduate students, with a view to increasing use of the data for research in the public benefit.
Year(s) Of Engagement Activity 2024
URL https://res.org.uk/committees/education-training-committee/res-doctoral-training-programme/expert-wo...
 
Description RES conference special session 2023 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact We co-organised a special session at the Royal Economic Society conference in Glasgow in 2023, one of the aims of which was to showcase the power of LEO data to a wide range of audiences. We presented work using LEO and also hosted a policy panel discussion, with members including Osama Rahman, former Chief Analyst and Chief Scientific Advisor at the Department for Education. The purpose of the event was to raise awareness of the LEO data and to showcase the type of policy-relevant research questions it can address.
Year(s) Of Engagement Activity 2023
 
Description RES conference special session 2024 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact We organised and ran a special session on Diversity and Productivity at the Royal Economic Society conference in Belfast in March 2024, with Osama Rahman, Director of ONS's Data Science Campus and Head of Diversity for the Government Economic Service. This showcased the value of LEO data for addressing policy-relevant questions and sparked interest among audience members about the data and how it could be used for research in future.
Year(s) Of Engagement Activity 2024
URL https://virtual.oxfordabstracts.com/event/4880/program
 
Description Updated gov.uk webpages for LEO 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact We worked with the LEO programme team at the Department for Education to update and reorganise the information shared regarding the LEO data via the gov.uk webpages. We provided summary information about what the data is, in addition to how to apply to access it, and made the information easier to find and navigate. The purpose was to increase the use of LEO data for research in the public benefit.
Year(s) Of Engagement Activity 2024
URL https://www.gov.uk/government/publications/longitudinal-education-outcomes-leo-dataset/longitudinal-...