FAIR TREATMENT: Federated analytics and AI Research across TREs for AdolescenT MENTal health

Lead Research Organisation: University of Cambridge
Department Name: UNLISTED


Negative aspects of a young person's life can lead to poor mental health (MH). However, services are stretched and so often intervene late, leaving young people to suffer longer-lasting or more severe problems. It is possible to spot patterns showing who needs professional help early. However, this is difficult because the information needed is held securely in different places (e.g. health, education and social care records) and falls under the remit of different research councils (MRC, ESRC). The main problems are:
1) predictive models aren’t accurate enough: difficulties linking the above data together probably result in many factors being missed;
2) models built in one place may not be effective in others: we need a way to securely analyse data from different places;
3) there is no agreement on how to make sure data are managed safely, fairly and transparently.
To solve these problems we will:
1) combine two new technologies to demonstrate it is possible to analyse data across trusted research environments in different places and preserve individuals' privacy;
2) consult with patients, the public, organisations contributing data, and legal/ethics experts to agree the best way to oversee data use, ensuring it’s managed safely and fairly.
We can start quickly as we have been working together for three years and have already been funded to bring data together from education, social care and health services in Cambridgeshire and Peterborough, and the necessary ethical permissions are in place.

Technical Summary

Artificial intelligence research initiatives are supported by the NHS but current practical barriers prevent researchers making use of the substantial datasets potentially available. The barriers are both technical (e.g. secure data federation) and legal (e.g. lack of an appropriate/acceptable governance model). We will address these issues by a) combining existing technologies to create a federated trusted research environment (TRE) based on the Five Safes principles and b) developing a governance package to support federated data analysis.
Our motivation is the need to improve the effectiveness of mental health services for young people, which, against a background of increasing demand, are overstretched. Mental health problems can manifest in ways that are hard to detect from the perspective of a single agency (e.g. health service, school, social service) but we hypothesise they will be apparent when combining data from these services.
There are distinct challenges in combining such data at scale. We propose to: 1) provide a technical demonstration of approaches for federation across Trusted Research Environments (TREs) as a model for cross-council digital research environments; 2) provide a Use Case that requires such cross-sector integration; and 3) examine the unique governance issues that arise, co-creating an aligned governance model that is acceptable to the public, patients and data contributors.
The technology demonstrator will combine the BBSRC-funded InterMine platform with federation technology from Bitfount within the AIMES TRE. InterMine makes use of automatic code generation to create, from the underlying data model, the required database infrastructure, including APIs and UIs, and is designed to enable flexible and high performance querying. These features are useful in an environment with complex and evolving metadata. It is easy to configure interfaces that allow flexible querying while constraining access to data: we do this to implement governance rules with access controls provided by Bitfount, as an example of best practice. Using APIs from multiple InterMine instances, Bitfount technology will demonstrate secure privacy-preserving federation of queries to address the Use Case, for cohort identification, as well as for vertical federated learning protocols across the TREs.
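As a rough illustration of the federated cohort-identification idea described above, the following minimal Python sketch shows each site evaluating a query locally and releasing only an aggregate count, with small counts suppressed. It is not the InterMine or Bitfount API: the names Record, Site, local_count, federated_count and the min_cell_size threshold of 5 are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Record:
    # Hypothetical fields; real TRE data models will differ.
    age: int
    camhs_referral: bool


@dataclass
class Site:
    """Stands in for one TRE. Row-level records never leave this object."""
    name: str
    records: list[Record]
    min_cell_size: int = 5  # assumed disclosure-control threshold

    def local_count(self, predicate: Callable[[Record], bool]) -> Optional[int]:
        # Evaluate the query inside the site and release only an aggregate.
        n = sum(1 for r in self.records if predicate(r))
        # Suppress small counts that could identify individuals.
        return n if n >= self.min_cell_size else None


def federated_count(sites: list[Site], predicate: Callable[[Record], bool]):
    """Combine per-site aggregates; suppressed sites are left out of the total."""
    per_site = {s.name: s.local_count(predicate) for s in sites}
    total = sum(n for n in per_site.values() if n is not None)
    return total, per_site


# Example: count 11-15 year olds with a CAMHS referral across two sites.
site_a = Site("TRE-A", [Record(13, True)] * 6 + [Record(13, False)] * 3)
site_b = Site("TRE-B", [Record(12, True)] * 2 + [Record(14, False)] * 4)
total, per_site = federated_count(
    [site_a, site_b], lambda r: r.camhs_referral and 11 <= r.age <= 15
)
print(total, per_site)
```

In this sketch the coordinating party only ever sees counts, which is one simple way to reflect the "safe outputs" element of the Five Safes; a real deployment would add authentication, query approval and stronger privacy protections.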
We will use synthetic data generated from real-world data dictionaries and have the necessary permission to do this. This will allow much of the work to be done outside controlled environments and will generate freely available non-sensitive datasets.
Importantly, we have the ethical approval to integrate the necessary data in the Cambridge & Peterborough region as part of the Cam-CHILD project and, with Turing funding, we will work with Birmingham and Essex to establish analogous ethical approval in preparation for a possible DARE Phase 2 bid. Involving three different localities (Cambridge, Essex, and Birmingham) will allow us to demonstrate the generalisability of the project outcomes to different TREs and databases.
Linking and exploiting data from such diverse sources presents unique challenges. The governance work we propose will undertake extensive engagement with the public, patients, practitioners, data controllers and legal experts to examine all aspects of the public acceptability and legal framework for supporting federated analysis across multiple TREs. This will produce freely available public communications documents and legal templates.



Description Presented work to shadow minister for innovation and technology
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
Description Timely: towards early identification of child and adolescent mental health problems
Amount £300,000 (GBP)
Funding ID T2-15 
Organisation Alan Turing Institute 
Sector Academic/University
Country United Kingdom
Start 09/2021 
End 03/2022
Description Towards early identification of child and adolescent mental health problems
Amount £297,000 (GBP)
Funding ID T2-15 
Organisation Alan Turing Institute 
Sector Academic/University
Country United Kingdom
Start 09/2021 
End 06/2022
Description Transforming child mental health: co-designing, building and evaluating a digitally enabled, personalised, prevention pathway
Amount £3,080,011 (GBP)
Funding ID MR/X034917/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 04/2024 
End 04/2031
Title Building an infrastructure able to integrate health, education and social data relating to children for research purposes. 
Description 1. We have developed a successful model to enable the governance and IG to be put in place to support multi-agency working. This is accompanied by a toolkit describing the steps to securing ethics approval for a research program. 2. We brought together two technologies to enable the build of a trusted research environment (TRE) including multi-agency data. 3. We created the data architecture to enable the build of a TRE including multi-agency children's data.
Type Of Material Improvements to research infrastructure 
Year Produced 2021 
Provided To Others? No  
Impact The approach will be published in 2022. 
Title Created federated informatics network for research purposes for paediatrics 
Description Integrated regional data and developed software to enable safe access to it. It will be available to others in the future.
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? No  
Impact We have been adopted by the mental health mission to provide digital infrastructure to enable paeds research for the UK 
Title CADRE 
Description Linked database including paeds data from health, education and social care. Will be available to others in the future. 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
Impact None yet; it is a work in progress.
Title Child mental health services database 
Description The database is currently being finalised. It includes de-identified data covering four years of patient-level child & adolescent MH services (CAMHS) data from 20 sites. We are building this using InterMine, which is enabling us to translate a genetics informatics platform into one that can be used for NHS service data.
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact The database is in its final stages of completion. However, it is already being used to enable a 20-site case-control study of the effectiveness of a new model of care for CAMHS (THRIVE). We are using this work to support the translation of InterMine into an informatics platform for health services data, as part of the MRC grant.
Title Linking data relating to health, education and social care for all children in Wales within the SAIL/ADP databank.
Description We linked 17 databases relating to children in Wales for the first time.
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? No  
Impact We are able to carry out epidemiological research on this database, and build early identification models for child health. 
Description Building capacity for federated AI for adolescent mental health. 
Organisation University of Birmingham
Department School of Psychology Birmingham
Country United Kingdom 
Sector Academic/University 
PI Contribution We are supporting Birmingham to build a TRE locally, using the methods developed by the MRC Timely grant. We will work together to federate our TREs, creating a mechanism for external validation of our early identification models. We are also expanding recruitment of our child and adolescent cohort (11-15y) to include Birmingham, so these data will be included in our models as well.
Collaborator Contribution Birmingham supported us in drafting an application to the Turing Institute which was successful. We are currently drafting an HDRUK/UKRI application for the sprints.
Impact Successful application to the Alan Turing Institute for £300,000 funding. Application to the HDRUK/UKRI sprints. Started recruitment of a cohort of adolescents into our genetic cohort for inclusion in the database. We are doing a lot of work on inequalities and on reducing these in datasets to make them more representative.
Start Year 2021
Description Collaboration to build capacity for federated AI - partnership with Essex University 
Organisation University of Essex
Department Department of Psychology
Country United Kingdom 
Sector Academic/University 
PI Contribution Collaboration to support Essex in creating a TRE based on the model created by Cam-CHILD during the MRC Engagement Award. Collaboration with us on a successful grant.
Collaborator Contribution Will create another TRE, enabling us to federate and build capacity for adolescent MH research.
Impact Successful application for funding to the Alan Turing Institute. Application to the HDR UK/UKRI DARE sprints.
Start Year 2021
Description Collaboration with charity to develop integrated data resource 
Organisation Anna Freud Centre
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution AFC are partnering with the research team to build an informatics platform integrating health and social care data. This involves a workstream involving members of the public and service users to explore the acceptability of the use of electronic health and care records data. We are also collaborating to create the first general population cohort of children and adolescents for the NIHR BioResource. The PPI team is partially funded by the MRC Adolescent Engagement Award I hold. We provide a link into the research including communication and training.
Collaborator Contribution Helped to recruit a Young Champion. Academic leadership of the PPI workstream. The Schools team is helping to identify and liaise with schools to support recruitment.
Impact - active PPI group contributing to research - secured funding for study co-ordinator from NIHR BioResource - it is multidisciplinary (psychiatry, genetics, BRC, health services research, informatics, PPI)
Start Year 2020
Description Illumina 
Organisation Illumina Inc.
Department Illumina
Country United Kingdom 
Sector Private 
PI Contribution Contact with Illumina to discuss with them the value of research in child MH genetics.
Collaborator Contribution They are contributing £100k of whole genome sequencing.
Impact UKRI Future Leaders Fellowship
Start Year 2023
Description Microsoft research 
Organisation Microsoft Research
Country Global 
Sector Private 
PI Contribution Collaboration to design digital tools
Collaborator Contribution They are providing me with mentorship, access to training for my team and me, and direct input to project work.
Impact UKRI Future Leaders Fellowship
Start Year 2023
Description Partnership with NIHR BioResource
Organisation National Institute for Health Research
Department National Institute for Health Research (NIHR) BioResource
Country United Kingdom 
Sector Academic/University 
PI Contribution I have been appointed Clinical Lead of the NIHR Children and Young People's BioResource. I am helping to establish a novel cohort of children and young people to join the BioResource. I have also secured the partnership of a leading national charity (Anna Freud Centre) to support this work. We will be taking a novel approach and recruiting children via schools.
Collaborator Contribution The aim of my work with the BioR is to include genetic data into the linked data resource we are currently building. We will work with them to determine how best to link this data to the linked platform, addressing information governance, technical, security and legal issues.
Impact Creation of a national Expert Working Group, partnership with a leading national charity, commissioning of a schools PPI group and young people's PPI group.
Start Year 2020
Description Partnership with Bitfount - start up company that specialises in federated AI 
Organisation Bitfount Ltd
Country United Kingdom 
Sector Private 
PI Contribution We have supported Bitfount to understand what is required to build a Trusted Research environment and how federated analytics is important. We have supported them to understand the 'five safes' of research data. They also have an additional clinical example to include in their portfolio.
Collaborator Contribution Bitfount will provide the capability to carry out privacy preserving federated analytics across a range of TREs. This is a critical functional requirement to enable the external validation of the early identification AI models we are building, as well as providing larger sample sizes.
Impact We have drafted an application to the UKRI/HDRUK DARE sprint program.
Start Year 2021
Description Partnership with CPFT Mental Health trust to create linked health & social care database
Organisation Cambridgeshire and Peterborough NHS Foundation Trust
Country United Kingdom 
Sector Public 
PI Contribution Provide academic input into PPI group, providing access to linked database.
Collaborator Contribution Access to CPFT data, secure data bank, support with PPI
Impact Publication on digital working. Secured an MRC grant together.
Start Year 2019
Description Partnership with CUH acute hospital to create linked health and social care database 
Organisation Cambridge University Hospitals NHS Foundation Trust
Country United Kingdom 
Sector Public 
PI Contribution Building a linked dataset enabling CUH to use its data. Analysis of their children's A&E data to support novel pathways.
Collaborator Contribution Access to EPIC data
Impact MRC grant secured
Start Year 2020
Description Partnership with Community Health Services to create linked database
Organisation Cambridgeshire Community Services NHS Trust
Country United Kingdom 
Sector Public 
PI Contribution We have provided them with training to de-identify their data using validated software (CRATE). We have provided facilitated workshops to support the identification of the data that is required.
Collaborator Contribution The clinical and informatics teams are working with us to develop a linked dataset. The clinical lead and informatics leads have worked closely with us to map the databases and identify the datasets that we require. They are undertaking training to enable them to de-identify the data locally, and the data will be transferred to us periodically. They are also contributing to the work to develop the live linked database.
Impact I will be completing a clinical training post in the service as a direct result of this collaboration. I also hope to build a clinical service in their organisation as a direct result of this work.
Start Year 2019
Description Partnership with department of Genetics 
Organisation University of Cambridge
Department Department of Genetics
Country United Kingdom 
Sector Academic/University 
PI Contribution We have secured a grant that has led to funding for the department. Providing education in health services structure and related informatics.
Collaborator Contribution They are providing access to a Wellcome Trust-funded informatics platform that we are adapting for use with healthcare data.
Impact We are building the Cambridge child health informatics and linked data platform (Cam-CHILD).
Start Year 2020
Description Partnership with leading governance and data security consultancy
Organisation Kaleidoscope
Country United Kingdom 
Sector Private 
PI Contribution We have built a partnership providing the consultancy with a novel challenge and collaboration with Uni of Cambridge to solve some of the most challenging data IG problems - the access, sharing, linkage and use of sensitive children's data for research purposes.
Collaborator Contribution They are supporting us as we work with partners to develop a suitable IG model.
Impact Application to the HDRUK/UKRI DARE sprints. Data flow diagrams produced, and we are working towards developing a governance model.
Start Year 2021
Description Partnership with local authority to create linked database 
Organisation Cambridgeshire County Council
Department Public Health Service; Cambridgeshire County Council
Country United Kingdom 
Sector Public 
PI Contribution We have provided training and support to develop a method of mapping out data required for the linked dataset.
Collaborator Contribution - service, IT and information systems leads are working with us to map out the data requirements for the database. - will pseudonymise data - will extract data for the database and update this periodically - will contribute to governance of subsequent dataset
Impact We have submitted an NIHR application to the 'Unlocking Local Authority Data' call
Start Year 2019
Description Partnership with the Department of Engineering 
Organisation University of Cambridge
Department Department of Engineering
Country United Kingdom 
Sector Academic/University 
PI Contribution We have led the application of the 'Systems Thinking' approach to the development of early identification tools for child MH.
Collaborator Contribution Senior Academic attends all meetings and is leading a workstream on how best to take a systems approach to early identification in adolescent MH.
Impact Submitted and secured an MRC adolescent engagement award.
Start Year 2020
Title Federated trusted research environment for linked data 
Description This is a trusted research environment that can securely host multiagency data and make it available for research purposes. It is able to federate with other TREs housing similar data to carry out privacy preserving federated analytics. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2022 
Impact It will be used as part of the Cambridge Children's Hospital informatics research infrastructure.
Description BBC news coverage for fellowship 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact When the award was announced it got interest from the BBC, and it was featured in a national article, as well as on the regional news.
Year(s) Of Engagement Activity 2023
URL https://www.bbc.co.uk/news/uk-england-cambridgeshire-67624048
Description Presentation about MH & genomics for the Cambridge Children's and Illumina meeting about future direction 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Presented to Illumina & CCH the role of genetics in child MH, and opportunities for use of data in digital early identification tools.
Year(s) Of Engagement Activity 2022
Description Presentation at Cambridge Children's Hospital Digital Board 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact We presented on how linked data will be used within the children's hospital, and the role of the infrastructure we built with the grants in the research unit. We also influenced the development of their digital strategy.
Year(s) Of Engagement Activity 2022
Description Presentation to Lucy Chappell about digital work taking place in Cambridge
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Presentation to Lucy Chappell about the issues we face in informatics and digital research, what the University of Cambridge is doing, and what we feel are the key issues that currently need to be addressed to advance the field.
Year(s) Of Engagement Activity 2023
Description Public engagement with DARE UK sprint program 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact We participated in the DARE UK Sprint public launch and presented the aim and purpose of our project to the public and other audiences.
Year(s) Of Engagement Activity 2022
Description Recruitment of 200 members of the public to participate in our community of interest, contributing to supporting child health research
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact We used social media and support from charities to recruit over 200 members of the public willing to participate in PPI activities relating to children's health research.
Year(s) Of Engagement Activity 2022,2023
Description School network presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact We attended the Birmingham local authority school wellbeing network to present on the NIHR Young People's BioResource and encourage schools to participate.
Year(s) Of Engagement Activity 2022
Description TikTok video presenting the outcomes of patient and public involvement work with parents and young people about the use of linked data and AI for MH
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact The TikTok video presented findings of the PPI process: the acceptability of using linked data, what IG should be put in place, and the recommendations for communications with the public.
Year(s) Of Engagement Activity 2023