Harnessing digital data to study 21st-century adolescence

Lead Research Organisation: UNIVERSITY OF CAMBRIDGE
Department Name: MRC Cognition and Brain Sciences Unit

Abstract

Almost every child in the UK is now living in both offline and online worlds. Just as they go to school every day, they also engage in online environments on a daily basis. A recent data collection by Ofcom showed that an average UK 16-year-old reports spending about 4 hours and 54 minutes a day online. In the USA, 46% of young people aged 13-17 say they are online 'almost constantly' while another 48% say they are online 'several times a day' (Pew, 2022). If we want to understand adolescent health in this digital age, we need to collect adequate data about what they do, see, and experience when spending time online.

We need to do so because online activities can influence a range of mental and physical health factors. For example, seeing self-harm content online could affect the mental health of a vulnerable adolescent who has just had a very difficult day at school. Certain TikTok trends, in turn, could change another adolescent's activity levels or what they eat, and so affect their physical health. Further, the communities young people find online can be a source of support, especially for those already struggling with their health.

However, no large UK data collection investment targeting adolescents currently collects adequate data about any of these processes. Many rely only on questionnaires to understand the online world. For example, they ask how much time teenagers spend on social media. Yet time spent engaging with digital environments is not a good indicator of their impact, as it does not tell us what adolescents are seeing or doing online. Further, teenagers are very bad at estimating the time they spend on digital platforms, making current questionnaire measures woefully unreliable.

So how do we adequately capture interaction with digital environments to better understand adolescent health? One especially promising method, developed in recent years through a variety of international grants, is "Digital Data Donation". Digital Data Donation uses our data rights: as users of digital platforms, we are entitled to request our data from these platforms and are then free to share it with anyone we want. To allow us to exercise these rights, most digital platforms have introduced 'download your data' pages or services on their sites. Some in our team (Co-Is Boeschoten and Oberski) have developed infrastructure in the Netherlands to use these data download packages in an ethical and privacy-enhancing fashion to collect data for research. This offers an opportunity to explore whether this infrastructure could be used to collect important data in the Adolescent Health Study.

In this project we will complete four workstreams to achieve this goal. First, we will get feedback from young people about the data donation infrastructure and test its feasibility in a classroom data collection environment. We will explore a research question that young people acting as our co-investigators throughout the project will help choose. Second, we will explore how we can not only collect digital data from young people, but also feed it back to them in fun and interactive data visualisations. We will work with young people and software engineers to explore what is possible and interesting, to ultimately develop a prototype of this data feedback tool. Third, this data feedback tool could also be used in a classroom environment, e.g., when teachers provide digital literacy lessons. We will consult teachers and develop a draft lesson plan integrating digital data donation and the data feedback tool. This could be used to motivate schools and teachers to engage in the broader Adolescent Health Study. Fourth, we will use what we learnt both in this project and in our other work to provide a briefing document setting out the pros and cons of a variety of digital data collection methods that can be used to inform the Adolescent Health Study and other future work.

Technical Summary

A broad range of approaches has been developed to collect digital data. Platform-centric approaches include the use of an API or web scraping for data collection. The effectiveness of these approaches is often limited by privacy concerns and sample selectivity. An alternative, user-centric approach is the use of tracking apps. Although these are typically expensive to develop and maintain, they offer a way to collect some types of data. For example, the US ABCD study and a project led by PI Orben have used an app to collect digital data from a subset of their participants, giving accurate information about time spent on screens. Yet both projects experienced limitations, with young people finding the app privacy-invasive and prone to glitches.

Recently, a more generic user-centric approach has gained popularity: the donation of digital trace data, also known as data donation. With data donation, participants request a digital copy of their personal archive (known as a 'Data Download Package') from a platform of interest, such as Instagram or WhatsApp. These platforms are legally obliged to provide this under the GDPR. Co-Is Boeschoten and Oberski on our team have recently developed a workflow that allows researchers to analyse these Data Download Packages in a privacy-enhancing way. In addition, they have developed this workflow into a reusable open-source software tool.

First, the participant visits a website where local extraction takes place. In practice, this means a Python script runs locally on the participant's device, in their web browser. It extracts only the data relevant to the current research project from their Data Download Package. After inspecting the extracted data, the participant can provide 'true' informed consent, after which the data is sent to the researcher. Because of this local extraction step, only information that is relevant to the research question is shared, and no sensitive data.
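
The local extraction step can be illustrated with a minimal Python sketch. It is not the team's actual tool: the function names, the archive layout (`activity/watch_history.json`) and the field names are all hypothetical, and it assumes the Data Download Package is a zip archive containing a JSON list of records. The idea shown is only that a whitelist of fields is applied before anything leaves the participant's device.

```python
import io
import json
import zipfile


def extract_relevant(package_bytes, member_name, keep_fields):
    """Return only the whitelisted fields from one JSON file in a zip archive.

    All names here are illustrative assumptions, not the real tool's API.
    """
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        with package.open(member_name) as handle:
            records = json.load(handle)
    # Keep whitelisted keys only; everything else never leaves this function.
    return [
        {key: record[key] for key in keep_fields if key in record}
        for record in records
    ]


def build_demo_package():
    """Build a tiny in-memory archive standing in for a real download package."""
    records = [
        {"timestamp": "2024-01-05T16:02:00", "category": "sport",
         "message_text": "private note"},
        {"timestamp": "2024-01-05T16:09:00", "category": "music",
         "message_text": "another private note"},
    ]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("activity/watch_history.json", json.dumps(records))
    return buffer.getvalue()


# The participant would inspect these rows before consenting to share them.
demo_rows = extract_relevant(
    build_demo_package(),
    "activity/watch_history.json",
    ["timestamp", "category"],
)
for row in demo_rows:
    print(row)
```

In this sketch the participant sees exactly the rows that would be sent, so the consent decision is made on the extracted data rather than on the full archive.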

Publications

 
Description: British Academy Workshop Series on Social Media Data Sharing
Geographic Reach: National
Policy Influence Type: Participation in a guidance/advisory committee
 
Description: Social Media Mechanisms Affecting Adolescent Mental Health (SoMe3)
Amount: £1,985,735 (GBP)
Funding ID: MR/X034925/1
Organisation: United Kingdom Research and Innovation
Sector: Public
Country: United Kingdom
Start: 03/2024
End: 02/2028
 
Title: Data Donation
Description: Infrastructure to allow kids to download data
Type Of Material: Improvements to research infrastructure
Year Produced: 2023
Provided To Others?: Yes
Impact: Not yet had big impact
 
Description: Data Donation
Organisation: Utrecht University
Country: Netherlands
Sector: Academic/University
PI Contribution: We are working on bringing data donation methodology to the UK
Collaborator Contribution: They are providing expertise in data donation
Impact: We have had one successful grant application
Start Year: 2023
 
Description: Adolescent advisory boards
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme?: No
Geographic Reach: National
Primary Audience: Patients, carers and/or patient groups
Results and Impact: My team has run multiple advisory boards of young people for our studies and has presented to schools in the process.
Year(s) Of Engagement Activity: 2023, 2024
 
Description: Interview for national newspaper by Amy Orben as well as team members
Form Of Engagement Activity: A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?: No
Geographic Reach: International
Primary Audience: Public/other audiences
Results and Impact: My team gave multiple interviews over this submission period, e.g. getting a headline in the Guardian
Year(s) Of Engagement Activity: 2023, 2024