Piloting A Secure, Scalable, Infrastructure for AI Dementia Research On Routinely Collected Data

Lead Research Organisation: University College London
Department Name: Computer Science

Abstract

Dementias affect over 55 million people worldwide, a figure projected to exceed 135 million by 2050. This age-related global pandemic-in-waiting is an ideal application for AI to effect real-world change in how we manage these debilitating illnesses. The problem is that computational tools are typically developed and tested on bespoke research datasets that bear little to no resemblance to the data routinely collected in the NHS. This project aims to remove the primary roadblock to unlocking AI for real-world impact in the dementias: researcher access to routinely collected data from Memory Clinics, the front line in dementia healthcare.

The increasing availability of large medical research datasets has created myriad opportunities for, and examples of, AI providing quantitative solutions targeting early diagnosis and accurate prognosis (Marinescu, MELBA 2021). However, such bespoke, high-quality research data is rarely representative of routinely collected healthcare data. This fundamental disconnection between AI technology developers and the frontline of dementia healthcare is a key roadblock preventing real-world impact.

Our solution is to connect developers with routinely collected data. Memory clinics represent the frontline of dementia healthcare services and are the ideal setting for prototyping solutions.

We provide two solutions: one transfers anonymised data to the AI researchers; the other takes the AI algorithms directly to the data.

Technical Summary

AI approaches promise a revolution in healthcare diagnosis, prognosis, disease monitoring and predicting medication response. Indeed, a wide range of approaches has been used in research (Pellegrini et al., 2018; Borchert et al., 2021). However, a gulf exists between the need for well-curated 'clean' datasets in AI research and the heterogeneous 'noisy' clinical and imaging datasets typical of healthcare. This fundamental disconnection between AI technology developers and the frontline of healthcare is a key roadblock preventing real-world impact. The solution is to connect developers with routinely collected data. Memory clinics represent the frontline of dementia healthcare services and are the ideal setting for prototyping solutions.

Broadly speaking, there are two approaches to facilitating AI analysis for healthcare: centralised and decentralised. We prototype and test one of each: a DICOM router for anonymising and exporting data from PACS to a centralised analysis platform, and a federated learning system that takes the algorithms to the data, working inside the NHS in a decentralised manner.
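To make the decentralised idea concrete, the following is a minimal sketch of federated averaging, a common aggregation scheme in federated learning. It is illustrative only: the source does not state which aggregation strategy the project uses, and the function names (local_update, federated_average, run_rounds), the toy linear model and the learning rate are all assumptions made for this example. The point it shows is that each site trains on its own data and shares only model parameters, never patient records.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held locally at one site


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.02, epochs: int = 5) -> List[float]:
    """Hypothetical site-side step: fit y ~ w*x + b by gradient descent
    on mean squared error, using only this site's local data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(site_weights: Sequence[Sequence[float]],
                      site_sizes: Sequence[int]) -> List[float]:
    """Coordinator-side step: average parameters across sites,
    weighting each site by its number of local samples."""
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


def run_rounds(sites: Sequence[Sequence[Sample]], rounds: int = 50) -> List[float]:
    """Alternate local training and aggregation. Only parameters
    leave each site; the raw samples stay where they are."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        sizes = [len(data) for data in sites]
        global_weights = federated_average(updates, sizes)
    return global_weights
```

In use, run_rounds would be called with one list of (x, y) samples per participating site. A production framework adds what this sketch omits, such as secure transport, node authentication and governance controls over which training plans may run.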
 
Title: Cloud FLIC: Federated Learning Infrastructure as Code in Virtual Private Clouds
Description: We are using Terraform to write infrastructure as code for cloud-based federated learning (FL) between hospitals. We use the open source FedBioMed FL software across virtual private clouds (VPCs) within each hospital. VPCs are hosted on Amazon Web Services (AWS), but the framework is flexible and other cloud service providers could be used.
Type Of Material: Improvements to research infrastructure
Year Produced: 2023
Provided To Others? No
Impact: We are breaking down barriers to research on routinely collected NHS healthcare data. Currently this is within and between our own teams: CODEC (NHS EPUT) and QMIN-MC (NHS CUH).
URL: https://github.com/ucl-codec