Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMatter)

Lead Research Organisation: University of Dundee
Department Name: UNLISTED

Abstract

Trusted Research Environments (TREs) provide a secure location for researchers to analyse data for projects in the public interest e.g. providing information to SAGE to fight the COVID-19 pandemic. TRE staff check outputs to prevent disclosure of individuals’ confidential data.

TREs have historically supported only classical statistical data analysis. There is an increasing need to also facilitate the training of Artificial Intelligence (AI) models. AI has many valuable applications, e.g., spotting human errors, streamlining processes, helping with repetitive tasks and supporting clinical decision making. The trained models then need to be exported from TREs for use. The size and complexity of AI models present significant challenges for the disclosure-checking process. Models may be susceptible to external attacks: sophisticated methods can reverse engineer the learning process to reveal information about the data used for training, with greater potential to lead to re-identification than the outputs of conventional statistical methods.

With input from public representatives, GRAIMatter will assess a range of tools and methods to support TREs to assess output from AI methods for potentially identifiable information, investigate the legal and ethical implications and controls, and produce a set of guidelines and recommendations to support all TREs with export controls of AI algorithms.

Technical Summary

TREs are widely and increasingly used to support statistical analysis of sensitive data across a range of sectors (e.g., education, police, tax and health), as they enable secure and transparent research whilst protecting data confidentiality.

There is increasing desire from academia and industry to train AI models in TREs. The field of AI is developing quickly, with applications including spotting human errors, streamlining processes, task automation and decision support. These more complex AI models require more information to describe and reproduce, increasing the possibility that sensitive information about the secure data can be inferred from such descriptions. TREs do not have mature processes and controls to guard against these risks. This is a complex topic, and it is unreasonable to expect all researchers to be aware of every risk, or that TRE researchers have addressed these risks through AI-specific training.

We aim to address this problem by developing a set of usable recommendations for TREs to use to guard against the additional risks when disclosing trained AI models from TREs. We will draw upon our internationally recognised expertise in TREs, AI, data governance, disclosure control, data security and confidentiality, law and ethics.

WP1: Quantitative Assessment of Risk: a detailed empirical study to evaluate vulnerabilities of a selection of machine learning models. We will explore different models, hyper-parameter settings and training algorithms over common data types.
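For illustration, the sketch below shows the kind of empirical vulnerability test WP1 will scale up: a simple confidence-thresholding membership inference attack against a deliberately over-fitted classifier, scored by attack AUC (0.5 indicates no leakage). The dataset, model and attack choices here are our own assumptions made for brevity; they are not GRAIMatter's chosen models or evaluation protocol.

# Illustrative only: a minimal membership inference sketch, not project code.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# Deliberately over-fitted target model (deep, unpruned trees) to expose leakage.
target = RandomForestClassifier(n_estimators=100, min_samples_leaf=1, random_state=0)
target.fit(X_train, y_train)

def confidence_in_true_label(model, X, y):
    # Probability the model assigns to each example's true label.
    proba = model.predict_proba(X)
    return proba[np.arange(len(y)), y]

# Attack score: members (training examples) tend to receive higher confidence.
scores = np.concatenate([
    confidence_in_true_label(target, X_train, y_train),
    confidence_in_true_label(target, X_test, y_test),
])
membership = np.concatenate([np.ones(len(y_train)), np.zeros(len(y_test))])

print("Membership inference attack AUC:", roc_auc_score(membership, scores))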

WP2: Controls and Evaluation of Tools: evaluation of the effectiveness of a range of tools for addressing the vulnerabilities identified in WP1, both technically (do they accurately quantify disclosure risks?) and organisationally (what is their impact on TRE output checking, and how well do they assist researchers in producing checkable 'safe' outputs?).
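As a toy illustration of the researcher-facing controls WP2 will evaluate, the sketch below checks a model's hyper-parameters against a simple release rule before it leaves the TRE. The rule and threshold are illustrative assumptions, not GRAIMatter recommendations or the behaviour of any specific tool.

# Illustrative only: a toy pre-release hyper-parameter check, not project code.
from sklearn.tree import DecisionTreeClassifier

MIN_SAMPLES_LEAF = 5  # assumed rule: each leaf must cover at least 5 records

def check_tree_for_release(model: DecisionTreeClassifier) -> list[str]:
    # Return a list of reasons the model should not be released as-is.
    problems = []
    leaf_size = model.get_params()["min_samples_leaf"]
    if leaf_size < MIN_SAMPLES_LEAF:
        problems.append(
            f"min_samples_leaf={leaf_size} < {MIN_SAMPLES_LEAF}: "
            "leaves may describe individual records"
        )
    return problems

risky = DecisionTreeClassifier(min_samples_leaf=1)
safer = DecisionTreeClassifier(min_samples_leaf=10)
print(check_tree_for_release(risky))   # flags the model for manual checking
print(check_tree_for_release(safer))   # [] - passes this rule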

WP3: Legal and Ethical Implications: a legal and ethical analysis of the findings from WP1/2 to develop a framework for the regulation of AI models developed in a protected environment, and to identify aspects of existing legal and regulatory frameworks requiring reform to facilitate the use and export of AI models from protected environments.

WP4: PPIE: four workshops run by lay Co-Is seeking input from the public and patients on our approach. We will produce lay summaries of project outputs, and will support and train our research team on how best to work with the public.

WP5: Green Paper: all WPs will collaborate on drafting a green paper and will seek input from the wider community through a consultation period. Our international collaborator will provide a perspective external to the UK.

Description Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMatter)
Amount £315,488 (GBP)
Funding ID MC_PC_21033 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 01/2022 
End 08/2022
 
Description SATRE - Standardised Architecture for Trusted Research Environments
Amount £614,112 (GBP)
Organisation United Kingdom Research and Innovation 
Sector Public
Country United Kingdom
Start 01/2023 
End 10/2023
 
Description TRE-FX: Delivering a federated network of TREs to enable safe analytics
Amount £562,457 (GBP)
Organisation United Kingdom Research and Innovation 
Sector Public
Country United Kingdom
Start 01/2023 
End 10/2023
 
Title Collection of tools and resources for managing the statistical disclosure control of trained machine learning models 
Description Tools for the Automatic Checking of Research Outputs 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Software to support TREs in checking trained ML models for disclosure risk
URL https://github.com/ai-sdc
 
Description Building a legacy for UK health data research infrastructure (speaker) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Presentation on 'Alleviate: the Advanced Pain Discovery Platform (APDP) Data Hub' for the DIH Programme Showcase Event: Building a legacy for UK health data research infrastructure.
Year(s) Of Engagement Activity 2022
URL https://www.youtube.com/watch?v=DLRU_35dfYQ&list=PLBI5k9SgYrItfzjZ17c1b20GUp6V2wDRH&index=15
 
Description Cambridge Spark Lecture Series 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Presentation: Overcoming the Challenges of Providing Access to Population Scale, Routinely Collected Health and Imaging Data for AI Development whilst Protecting Patient Confidentiality.
Year(s) Of Engagement Activity 2022
 
Description DARE UK Sprint Exemplar Event 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Event focussed on DARE UK's Sprint Exemplar Projects, in this case: GRAIMATTER: Guidelines and Resources for Artificial Intelligence Model Access from Trusted Research Environments
Year(s) Of Engagement Activity 2022
 
Description GRAIMATTER Recommendations Workshop (organiser) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact GRAIMATTER Recommendations Workshop.
Year(s) Of Engagement Activity 2022,2023
 
Description HDR UK Multi-omics Cohorts Consortium NIP Insight Sharing Day 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Invited speaker to HDR UK Multi-omics Cohorts Consortium National Implementation Project Insight Sharing Day. Presentation: Experiences of developing a TRE to support multi-omic data within the AWS cloud.
Year(s) Of Engagement Activity 2023
 
Description HDR UK Technology Ecosystem Conference/Workshop (organiser and speaker) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact The purpose of the meeting was to kick off the Technology Ecosystem work as part of the HDR UK 2023-28 strategy, bringing together various overlapping initiatives to share knowledge and plan how we will work together going forward.
Year(s) Of Engagement Activity 2023
 
Description Overcoming the Challenges of Providing Access to Population Scale, Routinely Collected Health and Imaging Data for AI Development whilst Protecting Patient Confidentiality 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Presentation to the AI summit in London covering the work of PICTURES, CO-CONNECT and GRAIMATTER
Year(s) Of Engagement Activity 2022
 
Description PPIE Workshops: GRAIMATTER 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Patients, carers and/or patient groups
Results and Impact Five PPIE workshops: GRAIMATTER - What is a TRE? What is ML? What are the GRAIMATTER project aims?
Year(s) Of Engagement Activity 2022
 
Description The AI Summit London (panel) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Expert panel speaker: The Landscape of AI Adoption in Medical Imaging - Challenges & Opportunities.
Year(s) Of Engagement Activity 2022
URL https://london.theaisummit.com/
 
Description UK DRI Informatics Scoping Event (speaker) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Invited speaker to UK Dementia Research Institute Informatics Scoping Event. Presentation on HDR UK's Technology Ecosystem Workstream.
Year(s) Of Engagement Activity 2023
 
Description UKRI DRI Community Congress (speaker) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact The UKRI Digital Research Infrastructure (DRI) Community Congress brought together stakeholders of the UKRI DRI strategy. Presentation: The power of DRI: A health data perspective.
Year(s) Of Engagement Activity 2023
URL https://web.cvent.com/event/fc0032b7-0b22-4dd0-8c4c-38f3155df75f/summary