Harnessing digital data to study 21st-century adolescence

Lead Research Organisation: University of Cambridge
Department Name: MRC Cognition and Brain Sciences Unit

Abstract

Almost every child in the UK is now living in both offline and online worlds. Just as they go to school every day, they also engage in online environments on a daily basis. A recent data collection by Ofcom showed that an average UK 16-year-old reports spending about 4 hours and 54 minutes a day online. In the USA, 46% of young people aged 13-17 say they are online 'almost constantly' while another 48% say they are online 'several times a day' (Pew, 2022). If we want to understand adolescent health in this digital age, we need to collect adequate data about what they do, see, and experience when spending time online.

We need to do so because online activities can influence a range of mental and physical health factors. For example, seeing self-harm content online could affect the mental health of a vulnerable adolescent who has just had a very difficult day at school. Certain TikTok trends, meanwhile, could change another adolescent's activity levels or what they eat, and so affect their physical health. Further, the communities young people find online can be a source of support, especially for those already struggling with their health.

However, currently no large UK data collection investment targeting adolescents is adequately collecting data about any of these processes. Many rely only on questionnaires to understand the online world. For example, they ask how much time teenagers spend on social media. However, time spent engaging with digital environments is not a good indicator of their impact, as we do not know what adolescents are seeing or doing online. Further, teenagers are very bad at estimating the time they spend on digital platforms, making current questionnaire measures woefully unreliable.

So how do we adequately capture interaction with digital environments to better understand adolescent health? One especially promising method, developed in recent years through a variety of international grants, is "Digital Data Donation". Digital Data Donation makes use of our data rights: as users of digital platforms, we are entitled to request our data from these platforms and are then free to share it with anyone we want. To allow us to exercise these rights, most digital platforms have introduced 'download your data' pages or services on their sites. Some in our team (Co-Is Boeschoten and Oberski) have developed infrastructure in the Netherlands that uses these data download packages in an ethical and privacy-enhancing fashion to collect data for research. This offers an opportunity to explore whether this infrastructure could be used to collect important data in the Adolescent Health Study.

In this project we will complete four workstreams to achieve this goal. First, we will get feedback from young people about the data donation infrastructure and test its feasibility in a classroom data collection environment. We will explore a research question that young people acting as our co-investigators throughout the project will help choose. Second, we will explore how we can not only collect digital data from young people, but also feed it back to them in fun and interactive data visualisations. We will work with young people and software engineers to explore what is possible and interesting, to ultimately develop a prototype of this data feedback tool. Third, this data feedback tool could also be used in a classroom environment, e.g., when teachers provide digital literacy lessons. We will consult teachers and develop a draft lesson plan integrating digital data donation and the data feedback tool. This could be used to motivate schools and teachers to engage in the broader Adolescent Health Study. Fourth, we will use what we learnt both in this project and in our other work to provide a briefing document setting out the pros and cons of a variety of digital data collection methods that can be used to inform the Adolescent Health Study and other future work.

Technical Summary

A broad range of approaches has been developed to collect digital data. Platform-centric approaches include the use of an API or web scraping for data collection. The effectiveness of these approaches is often limited by privacy concerns and sample selectivity. An alternative, user-centric approach is the use of tracking apps. Although these are typically expensive to develop and maintain, they offer a way to collect some types of data. For example, the US ABCD study and a project led by PI Orben have used an app to collect digital data from a subset of their participants, giving accurate information about time spent on screens. Yet both projects experienced limitations, with young people finding the app privacy-invasive and liable to glitches.

Recently, a more generic user-centric approach has gained popularity, namely the donation of digital trace data, also known as data donation. With data donation, participants request a digital copy of their personal archive (known as a 'Data Download Package') from a platform of interest, such as Instagram or WhatsApp. These platforms are legally obliged to share this under the GDPR. Co-Is Boeschoten and Oberski from our team have recently developed a workflow that allows researchers to analyse these Data Download Packages in a privacy-enhancing way. In addition, they have developed this workflow into a reusable open-source software tool.

First, the participant visits a website where local extraction takes place. In practice, this means a Python script runs locally on the participant's device, in their web browser. It extracts only the data relevant for the current research project from their Data Download Package. After inspecting the extracted data, the participant can provide 'true' informed consent, after which the data is sent to the researcher. Because of this local extraction step, only information that is relevant for the research question, and no sensitive data, is shared.
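
The local extraction step described above can be illustrated with a minimal sketch. This is not the team's actual tool: the file path `activity/posts.json`, the field names, and the helper functions are all hypothetical, chosen only to show the idea of keeping a whitelist of fields and releasing nothing until the participant consents.

```python
import io
import json
import zipfile

# Hypothetical package layout and field names (assumptions for illustration,
# not the format of any real platform's Data Download Package).
ACTIVITY_FILE = "activity/posts.json"
KEEP_FIELDS = ("timestamp", "type")  # only what the research question needs


def extract_relevant(package_bytes: bytes) -> list:
    """Read one file from the package and keep only the whitelisted fields."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        with package.open(ACTIVITY_FILE) as handle:
            events = json.load(handle)
    return [{key: event[key] for key in KEEP_FIELDS if key in event}
            for event in events]


def donate(rows: list, consent_given: bool):
    """Release the extracted rows only if the participant has consented."""
    return rows if consent_given else None


def build_demo_package() -> bytes:
    """Build a small in-memory package so the sketch can be run as-is."""
    events = [
        {"timestamp": 1700000000, "type": "post", "text": "private caption"},
        {"timestamp": 1700003600, "type": "like", "target": "friend_account"},
    ]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(ACTIVITY_FILE, json.dumps(events))
        # Sensitive file that the extraction never opens.
        package.writestr("messages/inbox.json", json.dumps(["secret"]))
    return buffer.getvalue()


if __name__ == "__main__":
    rows = extract_relevant(build_demo_package())
    print(rows)                  # what the participant would inspect
    print(donate(rows, False))   # nothing leaves the device without consent
```

In this sketch the captions, like targets, and messages never appear in the extracted rows, which mirrors the point that only the fields relevant to the research question are shared.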

Publications

 
Description British Academy Workshop Series on Social Media Data Sharing
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Submitted to two calls for evidence from Ofcom and the Department for Science, Innovation and Technology
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
Impact Better data sharing
 
Description Various meetings with senior policymakers, including the Director General at the Department for Science, Innovation and Technology, and presentations to MPs etc.
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
Impact Allowed policymakers to understand potential of data donation
 
Description CERES Sub-Award: Connecting the EdTech Research EcoSystem
Amount $250,000 (USD)
Organisation Jacobs Foundation 
Sector Charity/Non Profit
Country Switzerland
Start 04/2024 
End 04/2027
 
Description Promoting Well-Being in Preteens, Adolescents and Young Adults: Toward Improved Social Media Policies
Amount € 1,606,423 (EUR)
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 03/2025 
End 02/2028
 
Description Real-time and randomized tests of social media and mental health interplay in early adolescence
Amount $3,742,991 (USD)
Organisation National Institutes of Health (NIH) 
Sector Public
Country United States
Start 09/2024 
End 09/2030
 
Description Social Media Mechanisms Affecting Adolescent Mental Health (SoMe3)
Amount £1,985,735 (GBP)
Funding ID MR/X034925/1 
Organisation United Kingdom Research and Innovation 
Sector Public
Country United Kingdom
Start 03/2024 
End 02/2028
 
Description The environment and eating disorders: developing novel measures and hypotheses through inter-disciplinary collaborations.
Amount £1,093,377 (GBP)
Funding ID MR/X030725/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 09/2023 
End 09/2026
 
Title Data Donation 
Description Infrastructure to allow kids to download data 
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact Not yet had big impact 
 
Description Born in Bradford Data Donation 
Organisation Bradford Institute for Health Research (BIHR)
Department Born in Bradford
Country United Kingdom 
Sector Public 
PI Contribution I am senior advisor to data donation efforts in the Born in Bradford cohort and get access to data
Collaborator Contribution Born in Bradford is one of the UK's leading cohort studies
Impact N/A
Start Year 2024
 
Description Collaboration with over 7 universities working on data donation (Georgia Tech, Stanford, UCLA, York, Bradford, UCL, Sydney) 
Organisation Georgia Institute of Technology
Country United States 
Sector Academic/University 
PI Contribution Data from study collected
Collaborator Contribution Help analyse data or advise on how to use data donation in their studies
Impact N/A
Start Year 2024
 
Description Data Donation 
Organisation Utrecht University
Country Netherlands 
Sector Academic/University 
PI Contribution We are working on bringing data donation methodology to the UK
Collaborator Contribution They are providing expertise in data donation
Impact We have had one successful grant application
Start Year 2023
 
Description ESRC Smart Data Research UK - Social Data Donation Service 
Organisation University of York
Country United Kingdom 
Sector Academic/University 
PI Contribution I am now key advisor for social media data donation for the UKRI SDDS
Collaborator Contribution Senior advisor, allowing data donation to take place in this 7 Million GBP investment.
Impact N/A
Start Year 2024
 
Description MRC Adolescent Health Study 
Organisation Medical Research Council (MRC)
Country United Kingdom 
Sector Public 
PI Contribution I have provided a briefing paper on how to measure social media use in this new cohort
Collaborator Contribution New leading cohort will be collected
Impact One-pager
Start Year 2024
 
Description Adolescent advisory boards 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Patients, carers and/or patient groups
Results and Impact My team has run multiple advisory boards of young people for our studies and have presented to schools in the process.
Year(s) Of Engagement Activity 2023,2024,2025
 
Description Advisory Boards (various) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact My team engages in various advisory boards across government and the research sector, more information can be provided on request.
Year(s) Of Engagement Activity 2025
 
Description Interview for national newspaper by Amy Orben as well as team members 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact My team gave multiple interviews over this submission period, e.g. getting a headline in the Guardian
Year(s) Of Engagement Activity 2023,2024
 
Description Various national and international interviews about data access and data donation, including features in The Economist 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact My team gives media interviews every week, which have led to coverage such as a Guardian front page and features in The Economist
Year(s) Of Engagement Activity 2024,2025
 
Description Various policy roundtables engaged with (including at No 10 and the EU Parliament) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact My team engages in policy roundtables at least once a month including notably No 10, EU Parliament, Government Commissions both nationally and internationally. A full list can be provided on request.
Year(s) Of Engagement Activity 2024,2025