Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMatter)

Lead Research Organisation: University of Dundee

Department Name: UNLISTED

Abstract

Trusted Research Environments (TREs) provide a secure location for researchers to analyse data for projects in the public interest e.g. providing information to SAGE to fight the COVID-19 pandemic. TRE staff check outputs to prevent disclosure of individuals’ confidential data.

TREs have historically supported only classical statistical data analysis. There is an increasing need to also facilitate the training of Artificial Intelligence (AI) models. AI has many valuable applications, e.g. spotting human errors, streamlining processes, helping with repetitive tasks and supporting clinical decision making. The trained models then need to be exported from TREs for use. The size and complexity of AI models present significant challenges for the disclosure-checking process. Models may be susceptible to external hacking: complicated methods to reverse engineer the learning process to find out about the data used for training, with more potential to lead to re-identification than conventional statistical methods.

With input from public representatives, GRAIMatter will assess a range of tools and methods to support TREs to assess output from AI methods for potentially identifiable information, investigate the legal and ethical implications and controls, and produce a set of guidelines and recommendations to support all TREs with export controls of AI algorithms.

Technical Summary

TREs are widely and increasingly used to support statistical analysis of sensitive data across a range of sectors (e.g. education, police, tax and health), as they enable secure and transparent research whilst protecting data confidentiality.

There is increasing desire from academia and industry to train AI models in TREs. The field of AI is developing quickly with applications including spotting human errors, streamlining processes, task automation and decision support. These more complex AI models require more information to describe and reproduce, increasing the possibility that sensitive information regarding secure data can be inferred from such descriptions. TREs do not have mature processes and controls against these risks. This is a complex topic, and it is unreasonable to expect all researchers to be aware of all risks or that TRE researchers have addressed these risks in AI-specific training.

We aim to address this problem by developing a set of usable recommendations for TREs to use to guard against the additional risks when disclosing trained AI models from TREs. We will draw upon our internationally recognised expertise in TREs, AI, data governance, disclosure control, data security and confidentiality, law and ethics.

WP1: Quantitative Assessment of Risk: a detailed empirical study to evaluate vulnerabilities of a selection of machine learning models. We will explore different models, hyper-parameter settings and training algorithms over common data types.

WP2: Controls and Evaluation of Tools: evaluation of the effectiveness of a range of tools for addressing vulnerabilities identified in WP1, both technically (do they accurately quantify disclosure risks?) and organisationally (what is their impact on TRE output checking, and on assisting researchers to produce checkable ‘safe’ outputs?).

WP3: Legal and Ethical Implications: a legal and ethical analysis of information from WP1/2 to develop a framework for regulation of AI models developed in a protected environment, and identification of aspects of existing legal and regulatory frameworks requiring reform to facilitate the use and export of AI models from protected environments.

WP4: PPIE: 4 workshops run by lay Co-Is seeking input from public/patients on our approach. We will produce lay summaries of project outputs and support and train our researcher team on how best to work with the public.

WP5: Green Paper: all WPs will collaborate on drafting a green paper and seek input from the wider community through a consultation period. Our international collaborator will provide a perspective external to the UK.
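
As an illustration of the kind of vulnerability WP1 sets out to measure, the sketch below shows a simple confidence-threshold membership inference attack against a model that has memorised its training data. It is not taken from the project's own tools: the synthetic records, the toy model (`train_model`), the attack function (`infer_membership`) and the threshold value are all assumptions made for this example.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def make_records(n):
    """Synthetic stand-in for sensitive records: two numeric features each."""
    return [(random.random(), random.random()) for _ in range(n)]


def train_model(records):
    """Hypothetical overfitted model: it memorises every training record and
    reports a confidence that decays with distance to the nearest one."""
    memorised = list(records)

    def confidence(x):
        nearest = min(math.dist(x, t) for t in memorised)
        return 1.0 / (1.0 + nearest)  # exactly 1.0 for a memorised record

    return confidence


def infer_membership(model, x, threshold=0.999):
    """Attacker's guess, using only the model's output: a very high
    confidence suggests that x was part of the training data."""
    return model(x) >= threshold


train = make_records(200)    # records used to train the model
unseen = make_records(200)   # same population, never shown to the model
model = train_model(train)   # the object that would be exported from the TRE

# True positive rate: members correctly flagged as members.
tpr = sum(infer_membership(model, x) for x in train) / len(train)
# False positive rate: non-members wrongly flagged as members.
fpr = sum(infer_membership(model, x) for x in unseen) / len(unseen)
# Attack advantage: 0 means no leakage, 1 means every guess is right.
advantage = tpr - fpr

print(f"TPR={tpr:.2f} FPR={fpr:.2f} advantage={advantage:.2f}")
```

An advantage close to 1 means the exported model reveals who was in the training data. A model that generalises rather than memorises would behave similarly on seen and unseen records, pushing the advantage towards 0.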

Total Expenditure April 2006 - March 2023:

£315,488

Funded Period:

Jan 22 - Aug 22

Funder:

MRC

Project Status:

Closed

Project Category:

Intramural

Project Reference:

MC_PC_21033

Principal Investigator:

Emily Jefferson

Health Category:

Unclassified

Organisations

University of Dundee (Lead Research Organisation)

People	ORCID iD
Emily Jefferson (Principal Investigator)	http://orcid.org/0000-0003-2992-7582
Antony Chuter (Co-Investigator)	http://orcid.org/0000-0002-0646-5939
Felix Ritchie (Co-Investigator)	http://orcid.org/0000-0003-4097-4021
Francesco Tava (Co-Investigator)
Simon Rogers (Co-Investigator)
James Smith (Co-Investigator)
Christian Cole (Co-Investigator)	http://orcid.org/0000-0002-2560-2484
Josep Domingo-Ferrer (Co-Investigator)
Jillian Anne Beggs (Co-Investigator)
Angela Daly (Co-Investigator)

Publications

Caldwell J (2022) Scottish Medical Imaging Service - Technical and Governance controls. in International Journal of Population Data Science

Gao C (2022) A National Network of Safe Havens: Scottish Perspective. in Journal of medical Internet research

Jefferson E (2022) GRAIMATTER Green Paper: Recommendations for disclosure control of trained Machine Learning (ML) models from Trusted Research Environments (TREs)

Jefferson E (2022) Recommendations for disclosure control of trained Machine Learning (ML) models from Trusted Research Environments (TREs)

Kavianpour S (2022) Next-Generation Capabilities in Trusted Research Environments: Interview Study. in Journal of medical Internet research

Kerasidou C (2023) Machine learning models, trusted research environments and UK health data: ensuring a safe and beneficial future for AI development in healthcare. in Journal of Medical Ethics

Mansouri-Benssassi E (2023) Disclosure control of machine learning models from trusted research environments (TRE): New challenges and opportunities. in Heliyon

Ritchie F (2023) Machine learning models in trusted research environments -- understanding operational risks. in International Journal of Population Data Science

Policy Influence
Further Funding
Software and Technical Products
Engagement Activities


Description	Application of AI SDC in Scotland
Geographic Reach	National
Policy Influence Type	Contribution to new or improved professional practice
Impact	Information governance and data teams managing access to patient data are now better informed regarding the potential additional risks posed by AI/ML.


Description	Industry access to public sector data: Review of current operational practice
Geographic Reach	National
Policy Influence Type	Citation in other policy documents
URL	https://www.researchdata.scot/news-and-insights/new-report-examines-safeguards-for-businesses-to-acc...


Description	Guidelines and Resources for AI Model Access from TrusTEd Research environments (GRAIMatter)
Amount	£315,488 (GBP)
Funding ID	MC_PC_21033
Organisation	Medical Research Council (MRC)
Sector	Public
Country	United Kingdom
Start	01/2022
End	08/2022


Description	SATRE - Standardised Architecture for Trusted Research Environments
Amount	£614,112 (GBP)
Organisation	United Kingdom Research and Innovation
Sector	Public
Country	United Kingdom
Start	01/2023
End	10/2023


Description	Semi-Automated Checking of Research Outputs (SACRO)
Amount	£637,821 (GBP)
Organisation	United Kingdom Research and Innovation
Sector	Public
Country	United Kingdom
Start	01/2023
End	10/2023


Description	TRE-FX: Delivering a federated network of TREs to enable safe analytics
Amount	£562,457 (GBP)
Organisation	United Kingdom Research and Innovation
Sector	Public
Country	United Kingdom
Start	01/2023
End	10/2023


Title	Collection of tools and resources for managing the statistical disclosure control of trained machine learning models
Description	Tools for the Automatic Checking of Research Outputs
Type Of Technology	Software
Year Produced	2022
Open Source License?	Yes
Impact	Software to support TREs to check for disclosure of trained ML models
URL	https://github.com/ai-sdc


Description	Building a legacy for UK health data research infrastructure (speaker)
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	Presentation on 'Alleviate: the Advanced Pain Discovery Platform (APDP) Data Hub' for the DIH Programme Showcase Event: Building a legacy for UK health data research infrastructure.
Year(s) Of Engagement Activity	2022
URL	https://www.youtube.com/watch?v=DLRU_35dfYQ&list=PLBI5k9SgYrItfzjZ17c1b20GUp6V2wDRH&index=15


Description	Cambridge Spark Lecture Series
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Industry/Business
Results and Impact	Presentation: Overcoming the Challenges of Providing Access to Population Scale, Routinely Collected Health and Imaging Data for AI Development whilst Protecting Patient Confidentiality.
Year(s) Of Engagement Activity	2022


Description	DARE UK Sprint Exemplar Event
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	Event focussed on DARE UK's Sprint Exemplar Projects, in this case: GRAIMATTER: Guidelines and Resources for Artificial Intelligence Model Access from Trusted Research Environments
Year(s) Of Engagement Activity	2022


Description	ELSI Webinar on AI for Health
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Presented lecture on the ethical and privacy concerns around the use of AI/ML with health data within the context of a European project - HT-Advance. Initiated discussion around the project needs for data privacy and also better awareness of the issue for legal experts at INSERM, France.
Year(s) Of Engagement Activity	2023


Description	GRAIMATTER Recommendations Workshop (organiser)
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	GRAIMATTER Recommendations Workshop.
Year(s) Of Engagement Activity	2022,2023


Description	HDR UK Multi-omics Cohorts Consortium NIP Insight Sharing Day
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	Invited speaker to HDR UK Multi-omics Cohorts Consortium National Implementation Project Insight Sharing Day. Presentation: Experiences of developing a TRE to support multi-omic data within the AWS cloud.
Year(s) Of Engagement Activity	2023


Description	HDR UK Technology Ecosystem Conference/Workshop (organiser and speaker)
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	The purpose of the meeting was to kick off the Technology Ecosystem work as part of HDR UK 23-28 strategy, bringing together various overlapping initiatives to share knowledge and plan for how we will work together going forward.
Year(s) Of Engagement Activity	2023


Description	HPC-AI Conference
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Emily Jefferson (CTO, HDR UK and Interim Director of DARE UK) was an invited speaker at the 5th Annual HPC-AI Advisory Council UK Conference. Presentation: TREs at Scale.
Year(s) Of Engagement Activity	2023
URL	https://www.hpcwire.com/off-the-wire/5th-annual-hpc-ai-advisory-council-uk-conference-set-for-octobe...


Description	Invited speaker: HDR Technology Ecosystem and the Gateway: UKRI Data Infrastructure Club Show and Tell
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	Emily Jefferson was an invited speaker, leading a presentation on the HDR UK Technology Ecosystem and the Gateway at the UKRI Data Infrastructure Club Show and Tell: 31st Jan 2023.
Year(s) Of Engagement Activity	2023


Description	Invited speaker: HDR Technology Ecosystem. UK DRI Informatics Scoping Event - London. 8th and 9th March 2023
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Emily Jefferson, CTO of HDR UK, was an invited speaker, leading a presentation on the HDR Technology Ecosystem at the UK DRI Informatics Scoping Event in London on 8th and 9th March 2023.
Year(s) Of Engagement Activity	2023


Description	Invited speaker: Technology Ecosystem - Launch. Technology Ecosystem Conference/Workshop. Birmingham. 6th Feb 2023
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	Technology Ecosystem Conference (6th February 2023) brought together different technology groups from across the community to strengthen relationships and generate ideas to deliver trustworthy infrastructure and services across the health data research ecosystem
Year(s) Of Engagement Activity	2023


Description	Invited speaker: The power of DRI: A health data perspective. UKRI Digital Research Infrastructure (DRI) Congress.
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Emily Jefferson was an invited speaker to present on: The power of DRI: A health data perspective at the UKRI Digital Research Infrastructure (DRI) Congress. 6th and 7th March 2023.
Year(s) Of Engagement Activity	2023


Description	Japan Association for Medical Informatics Conference
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Other audiences
Results and Impact	Emily Jefferson (CTO, HDR UK and Interim Director of DARE UK) was a keynote speaker at the 43rd Joint Conference on Medical Informatics. Presentation: The UK's progress towards enabling secure, researcher access to sensitive health data at a UK population scale.
Year(s) Of Engagement Activity	2023
URL	https://confit-atlas-jp.translate.goog/guide/event/jcmi2023/session/3A11-13/detail?_x_tr_sl=ja&_x_tr...


Description	Keynote speaker: Towards Federated Analytics for Population Data. International Data Science Conference - Tokyo, Japan, 22/05/23
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Emily Jefferson was invited as a keynote speaker to present 'Towards Federated Analytics for Population Data' at the International Data Science Conference in Tokyo, Japan, on 22/05/23.
Year(s) Of Engagement Activity	2023


Description	Keynote speaker: Towards Federated Analytics for Population Data. Precision Medicine & Real-World Data Conference - Singapore, 23/05/23
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Other audiences
Results and Impact	Keynote speaker; presented on experiences enabling a new UK infrastructure for finding and accessing population-wide data for research and public health analysis.
Year(s) Of Engagement Activity	2023
URL	https://info.bcplatforms.com/precision-medicine-and-rwd-conference-singapore-2023


Description	Overcoming the Challenges of Providing Access to Population Scale, Routinely Collected Health and Imaging Data for AI Development whilst Protecting Patient Confidentiality
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Industry/Business
Results and Impact	Presentation to the AI summit in London covering the work of PICTURES, CO-CONNECT and GRAIMATTER
Year(s) Of Engagement Activity	2022


Description	PPIE Workshops: GRAIMATTER
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Patients, carers and/or patient groups
Results and Impact	PPIE 5 Workshops: GRAIMATTER - What is a TRE? What is ML? What are the GRAIMATTER project aims?
Year(s) Of Engagement Activity	2022


Description	Pistoia Alliance Christmas Lecture
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Emily Jefferson (CTO, HDR UK and Interim Director of DARE UK) was a keynote speaker at Pistoia Alliance UK Life Science Informatics Forum. Presentation: Guidelines and Resources for Artificial Intelligence Model Access from Trusted Research Environments (GRAIMATTER).
Year(s) Of Engagement Activity	2023
URL	https://www.pistoiaalliance.org/events/


Description	Research Software Engineers (RSE) Conference
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Emily Jefferson (CTO, HDR UK and Interim Director of DARE UK) was an invited speaker to the Seventh Annual Research Software Engineering Conference. Presentation: Can convening a Technology Ecosystem help TREs to work together?
Year(s) Of Engagement Activity	2023
URL	https://rsecon23.society-rse.org/


Description	The AI Summit London (panel)
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Industry/Business
Results and Impact	Expert panel speaker: The Landscape of AI Adoption in Medical Imaging - Challenges & Opportunities.
Year(s) Of Engagement Activity	2022
URL	https://london.theaisummit.com/


Description	UK DRI Informatics Scoping Event (speaker)
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	Invited speaker to UK Dementia Research Institute Informatics Scoping Event. Presentation on HDR UK's Technology Ecosystem Workstream.
Year(s) Of Engagement Activity	2023


Description	UK TRE Community Meeting
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Emily Jefferson (CTO, HDR UK and Interim Director of DARE UK) was the keynote speaker at the UK TRE Community Meeting that was part of the RSE Conference. Presentation: Call to action!
Year(s) Of Engagement Activity	2023
URL	https://www.eventbrite.com/e/uk-tre-community-september-meeting-tickets-676066472017


Description	UKRI DRI Community Congress (speaker)
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Other audiences
Results and Impact	The UKRI Digital Research Infrastructure (DRI) Community Congress brought together stakeholders of the UKRI DRI strategy. Presentation: The power of DRI: A health data perspective.
Year(s) Of Engagement Activity	2023
URL	https://web.cvent.com/event/fc0032b7-0b22-4dd0-8c4c-38f3155df75f/summary
