FAIR TREATMENT: Federated analytics and AI Research across TREs for AdolescenT MENTal health

Lead Research Organisation: University of Cambridge
Department Name: UNLISTED

Abstract

Negative aspects of a young person's life can lead to poor mental health (MH). However, services are stretched and so often intervene late, leaving young people to suffer longer-lasting or more severe problems. It is possible to spot patterns showing who needs professional help early. However, this is difficult because the information needed is held securely in different places (e.g. health, education and social care records) and falls under the remit of different research councils (MRC, ESRC). The main problems are:
1) predictive models aren’t accurate enough: difficulties linking the above data together probably result in many factors being missed;
2) models built in one place may not be effective in others: we need a way to securely analyse data from different places;
3) there is no agreement on how to make sure data are managed safely, fairly and transparently.
To solve these problems we will:
1) combine two new technologies to demonstrate that it is possible to analyse data across trusted research environments in different places while preserving individuals' privacy;
2) consult with patients, the public, organisations contributing data, and legal/ethics experts to agree the best way to oversee data use, ensuring the data are managed safely and fairly.
We can start quickly as we have been working together for three years and have already been funded to bring data together from education, social care and health services in Cambridgeshire and Peterborough, and the necessary ethical permissions are in place.

Technical Summary

Artificial intelligence research initiatives are supported by the NHS but current practical barriers prevent researchers making use of the substantial datasets potentially available. The barriers are both technical (e.g. secure data federation) and legal (e.g. lack of an appropriate/acceptable governance model). We will address these issues by a) combining existing technologies to create a federated trusted research environment (TRE) based on the Five Safes principles and b) developing a governance package to support federated data analysis.
Our motivation is the need to improve the effectiveness of mental health services for young people, which, against a background of increasing demand, are overstretched. Mental health problems can manifest in ways that are hard to detect from the perspective of a single agency (e.g. health service, school, social service) but we hypothesise they will be apparent when combining data from these services.
There are distinct challenges in combining such data at scale. We propose to: 1) provide a technical demonstration of approaches for federation across Trusted Research Environments (TREs) as a model for cross-council digital research environments; 2) provide a Use Case that requires such cross-sector integration; and 3) examine the unique governance issues that arise, co-creating an aligned governance model that is acceptable to the public, patients and data contributors.
The technology demonstrator will combine the BBSRC-funded InterMine platform with federation technology from Bitfount within the AIMES TRE. InterMine makes use of automatic code generation to create, from the underlying data model, the required database infrastructure, including APIs and UIs, and is designed to enable flexible and high performance querying. These features are useful in an environment with complex and evolving metadata. It is easy to configure interfaces that allow flexible querying while constraining access to data: we do this to implement governance rules with access controls provided by Bitfount, as an example of best practice. Using APIs from multiple InterMine instances, Bitfount technology will demonstrate secure privacy-preserving federation of queries to address the Use Case, for cohort identification, as well as for vertical federated learning protocols across the TREs.
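The federated cohort-identification idea above can be illustrated with a minimal sketch. It does not use the InterMine or Bitfount APIs; the class `TreNode`, the function `federated_count`, the field names `age` and `absence_days`, and the small-count threshold of 5 are all hypothetical, chosen only to show the principle that each TRE answers a query locally and releases only an aggregate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, int]


@dataclass
class TreNode:
    """Hypothetical stand-in for one TRE; row-level records stay inside it."""
    name: str
    records: List[Record]
    min_cell_count: int = 5  # assumed disclosure-control threshold

    def count_matching(self, predicate: Callable[[Record], bool]) -> Optional[int]:
        """Run the query locally and release only a count, or None if too small."""
        n = sum(1 for r in self.records if predicate(r))
        return n if n >= self.min_cell_count else None


def federated_count(nodes: List[TreNode], predicate: Callable[[Record], bool]) -> dict:
    """Send the same query to every node and combine the released aggregates."""
    per_site = {node.name: node.count_matching(predicate) for node in nodes}
    total = sum(c for c in per_site.values() if c is not None)
    return {"per_site": per_site, "total_of_released_counts": total}


def synthetic_records(n_match: int, n_other: int) -> List[Record]:
    """Build synthetic rows; no real data are involved."""
    matching = [{"age": 14, "absence_days": 30} for _ in range(n_match)]
    other = [{"age": 14, "absence_days": 2} for _ in range(n_other)]
    return matching + other


nodes = [
    TreNode("cambridge", synthetic_records(6, 2)),
    TreNode("essex", synthetic_records(2, 9)),
    TreNode("birmingham", synthetic_records(7, 1)),
]

# Hypothetical cohort definition: adolescents with high school absence.
result = federated_count(
    nodes, lambda r: 11 <= r["age"] <= 18 and r["absence_days"] >= 20
)
print(result)
```

In this sketch the coordinator never sees individual records, and a site whose count falls below the assumed threshold releases nothing. A real deployment would rely on the access controls and privacy-preserving protocols of the federation layer rather than on this simplified suppression rule.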
We will use synthetic data generated from real-world data dictionaries and have the necessary permission to do this. This will allow much of the work to be done outside controlled environments and will generate freely available non-sensitive datasets.
Importantly, we have the ethical approval to integrate the necessary data in the Cambridge & Peterborough region as part of the Cam-CHILD project and, with Turing funding, we will work with Birmingham and Essex to establish analogous ethical approval in preparation for a possible DARE Phase 2 bid. Involving three different localities (Cambridge, Essex, and Birmingham) will allow us to demonstrate the generalisability of the project outcomes to different TREs and databases.
Linking and exploiting data from such diverse sources presents unique challenges. The governance work we propose will undertake extensive engagement with the public, patients, practitioners, data controllers and legal experts to examine all aspects of the public acceptability and legal framework for supporting federated analysis across multiple TREs. This will produce freely available public communications documents and legal templates.

Publications


 
Title Federated trusted research environment for linked data 
Description This is a trusted research environment that can securely host multiagency data and make it available for research purposes. It can federate with other TREs housing similar data to carry out privacy-preserving federated analytics. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2022 
Impact It will be used as part of the Cambridge Children's Hospital informatics research infrastructure. 
 
Description Presentation about MH & genomics for the Cambridge Children's and Illumina meeting about future direction 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Presented to Illumina & CCH the role of genetics in child MH, and opportunities for use of data in digital early identification tools.
Year(s) Of Engagement Activity 2022
 
Description Presentation at Cambridge Children's Hospital Digital Board 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact We presented on how linked data will be used within the children's hospital, and the role of the infrastructure we built with the grants in the research unit. We also influenced the development of their digital strategy.
Year(s) Of Engagement Activity 2022
 
Description Presentation to Lucy Chappell about digital work taking place in Cambridge 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Presentation to Lucy Chappell about the issues we face in informatics and digital research, what the University of Cambridge is doing, and what we feel are the key issues that currently need to be addressed to advance the field.
Year(s) Of Engagement Activity 2023
 
Description Recruitment of 200 members of the public to participate in our community of interest, contributing to supporting child health research 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact We used social media and support from charities to recruit over 200 members of the public willing to participate in patient and public involvement (PPI) activities relating to children's health research.
Year(s) Of Engagement Activity 2022,2023
 
Description TikTok video presenting the outcomes of patient and public involvement work with parents and young people about the use of linked data and AI for MH 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact The TikTok video presented the findings of the PPI process: the acceptability of using linked data, what information governance (IG) should be put in place, and the recommendations for communications with the public.
Year(s) Of Engagement Activity 2023