Development and enhancement of Longitudinal Education Outcomes (LEO) data
Lead Research Organisation:
University College London
Department Name: Learning and Leadership
Abstract
Understanding how much individuals and society benefit from different education and training courses is vital for governments weighing up investments in education and skills. Access to data with rich information on education, training and earnings is crucial to estimating these benefits. A sample large enough to show whether the benefits vary across different groups (e.g. by socio-economic background) or different areas of the country is equally important for informing policy decisions, such as the extent to which investment in education and skills for disadvantaged individuals or those living in 'left-behind' areas will help 'level up' the country.
We have access to such data in England, known as the Longitudinal Education Outcomes (LEO) data, which link together education records, benefit records and tax records. These data have provided crucial new insight into, for example, how much individuals and areas benefit from higher education. But, to date, access to these data has been restricted to a relatively small number of individuals and organisations. The data are becoming more widely available, but the number and complexity of the datasets included in LEO present a substantial barrier to new users: they have to invest a lot of time in understanding the data before they can use them effectively, and some important research questions may go unanswered as a result.
Moreover, the data could be even more useful if we were able to incorporate additional information. For example, if we could include information on the places where individuals work - and who works with whom - then we could understand how much investment in education and training benefits people's colleagues, and the businesses in which they work. Similarly, if we were able to link in information about which individuals applied to university, and where, and compare this to the offers they received and where they went, we could understand more about the role of individual preferences and university decisions in generating the strong links evident between socio-economic background, education choices and later outcomes.
Our project will fill both of these gaps. Specifically, it will:
1. Enhance existing LEO data by:
a. Creating a simplified and consistent set of variables summarising important pieces of information from the data, such as measures of educational attainment, employment and earnings, that researchers can use to help get them started with their analysis.
b. Linking in new contextual data, such as about the areas in which individuals live.
c. Sharing documentation, code and metadata for these newly created variables (in a. and b.).
d. Creating and running an online forum through which current and potential users can find information about the data and future developments, and seek help from other users.
e. Providing introductory and advanced training events to build capacity in use of the data.
2. Link in new data, including on the places where individuals work and, for those who applied, information on their university applications and offers, and:
a. Incorporate these data into each of the elements outlined under 1. for existing LEO data, i.e. produce documentation and consistent variables for these new data; merge in additional relevant contextual data (e.g. on the 'quality' of the higher education institutions applied to and attended); and build awareness and capacity in use of these new data by incorporating information into the online forum and providing bespoke training events and resources.
b. Undertake new research to demonstrate the value of this new data in addressing important policy-relevant questions, such as on the link between education and business productivity, and whether policies which give lower university entry offers to students from more disadvantaged backgrounds are effective in improving outcomes for these individuals.
Description | Development and delivery of introduction to LEO course |
Geographic Reach | National |
Policy Influence Type | Influenced training of practitioners or researchers |
Impact | We improved the knowledge of participants on our course, enabling them to apply for and use LEO data more effectively. |
URL | https://www.ucl.ac.uk/ioe/departments-and-centres/centres/centre-education-policy-and-equalising-opp... |
Description | Development and delivery of introduction to NPD course |
Geographic Reach | National |
Policy Influence Type | Influenced training of practitioners or researchers |
Impact | We improved the knowledge of participants on our course, enabling them to apply for and use NPD, LEO, GRADE and GUiE data more effectively. |
URL | https://www.ucl.ac.uk/ioe/departments-and-centres/centres/centre-education-policy-and-equalising-opp... |
Description | Partnership with the Department for Education LEO Programme team |
Organisation | Department for Education |
Country | United Kingdom |
Sector | Public |
PI Contribution | All members of the UCL team funded by this grant are on part-time secondment to the LEO Programme team at the Department for Education (DfE), who are responsible for the development and sharing of LEO data. We are fully embedded within the LEO Programme team, with our work on this grant supporting them to deliver their objectives to enhance the usability and use of LEO data amongst the external research community. |
Collaborator Contribution | The LEO Programme team at DfE are responsible for developing and enhancing the LEO data and supporting resources to serve the needs of researchers both inside and outside government. They are the gatekeepers to the LEO data and provide the conduit through which we can access data to derive new variables and create shareable code, improve documentation, and liaise with contacts elsewhere within DfE (e.g. to determine the best approach to creating and sharing synthetic data) and in other government departments (e.g. ONS, to explore how we can share code with researchers inside the SRS). |
Impact | So far the partnership has resulted in co-delivery of events detailed elsewhere, e.g. the ADRUK public engagement discussion around LEO in September 2022; the grant launch event in October 2022; the ADRUK pre-conference workshop in November 2023; and the LEO training course in March 2024. This collaboration is with a non-academic partner whose expertise centres on project management and delivery, so it is multi-disciplinary. |
Start Year | 2022 |
Description | ADRUK Cambridge workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Policymakers/politicians |
Results and Impact | I gave a talk as part of an ADRUK Workshop on Administrative Data for Public Policy Research organised by the University of Cambridge. The purpose was to raise awareness of different types of education administrative data and how they could be used to address policy-relevant research questions. Audiences were engaged and asked a number of questions about the data. |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.educ.cam.ac.uk/events/workshops/adruk/ |
Description | ADRUK pre-conference workshop 2023 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Policymakers/politicians |
Results and Impact | We successfully applied to host a pre-conference workshop at the ADRUK conference in November 2023. This also acted as a second LEO user group meeting. The purpose was to bring together the LEO user community again, share developments and receive feedback on future plans. The main outcome was to raise awareness of the new iteration of LEO data that had recently been made available to external researchers. |
Year(s) Of Engagement Activity | 2023 |
URL | https://virtual.oxfordabstracts.com/#/event/4218/program?session=79378&s=269 |
Description | ADRUK public engagement discussion on LEO |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Third sector organisations |
Results and Impact | ADRUK convened a group of third sector organisations whose work could usefully be informed by LEO data/research to understand their perceptions of its value, and any risks they identified in it being used for research purposes. A report summarising the discussion was subsequently published (URL below). |
Year(s) Of Engagement Activity | 2022 |
URL | https://www.adruk.org/fileadmin/uploads/adruk/Documents/PE_reports_and_documents/LEO_report_key_mess... |
Description | Blog to accompany LEO I2 release |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Media (as a channel to the public) |
Results and Impact | I wrote a blog for the ADRUK website to accompany the launch of the second iteration of LEO data. The aim was to promote the data, particularly the new datasets that had been linked in, drawing attention to its potential to address policy-relevant research questions for the public good. |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.adruk.org/news-publications/news-blogs/new-longitudinal-education-outcomes-data-made-ava... |
Description | Dialogue with FFT re. synthetic data |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Third sector organisations |
Results and Impact | We met with representatives from FFT Education Datalab to discuss plans for the creation of NPD and LEO synthetic data. Plans were shared and it was agreed that we would keep each other informed about our work, and that the goals of the activities being undertaken by us and by them were complementary and thus mutually beneficial for the external research community. |
Year(s) Of Engagement Activity | 2024 |
Description | Dialogue with UCAS re. richer offers data |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Third sector organisations |
Results and Impact | I worked with colleagues in the Department for Education to put forward a business case to the Universities and Colleges Admissions Service to extend the data currently available for linkage in LEO. This dialogue has continued intermittently, but has not yet resulted in any new data being shared. We are continuing to pursue these discussions. |
Year(s) Of Engagement Activity | 2023 |
Description | Expert coding group |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Third sector organisations |
Results and Impact | We convened a group of existing LEO users to understand the extent of existing resources (such as code, derived variables) that could potentially be shared with the external research community, and to identify gaps that our grant could most usefully fill. This discussion resulted in a number of individuals sharing code to add to a code repository, and also helped to shape the focus of our efforts to create new exemplar code. |
Year(s) Of Engagement Activity | 2023 |
Description | First LEO user group meeting/grant launch event |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Third sector organisations |
Results and Impact | We held an event to launch the grant, which also doubled as an introductory LEO user group meeting. The purpose was to bring together individuals from a range of organisations interested in LEO for the purposes of undertaking, commissioning or using research, to offer greater insight into the LEO data, and to share information regarding planned future developments and obtain feedback on these plans, including around prospective data linkages and training and capacity building activities. |
Year(s) Of Engagement Activity | 2022 |
Description | Project advisory board |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Third sector organisations |
Results and Impact | We convened an advisory board to inform the work undertaken on the grant. The group consisted of representatives from third sector organisations (e.g. Edge Foundation, Education Policy Institute, FFT Education Datalab, National Foundation for Educational Research, Resolution Foundation, Sutton Trust), government departments (e.g. HM Treasury, the Office for National Statistics) and non-governmental organisations (e.g. Office for Students). These individuals represented organisations with expertise and/or interest in the LEO data, who could provide insight into the needs of data users, to ensure the grant delivered outputs of greatest value to the external research community. The group has met three times to date (in October 2022, May 2023 and January 2024) and has informed the types of training and capacity building activities we will deliver, as well as future data linkages. |
Year(s) Of Engagement Activity | 2022, 2023, 2024 |
Description | RES conference special session 2023 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Policymakers/politicians |
Results and Impact | We co-organised a special session at the Royal Economic Society conference in Glasgow in 2023, one of the aims of which was to showcase the power of LEO data to a wide range of audiences. We presented work using LEO and also hosted a policy panel discussion, with members including Osama Rahman, former Chief Analyst and Chief Scientific Advisor at the Department for Education. The purpose of the event was to raise awareness of the LEO data and to showcase the type of policy-relevant research questions it can address. |
Year(s) Of Engagement Activity | 2023 |