ISCF HDRUK DIH Sprint Exemplar: A scalable federated solution for UK health data encryption, linking and discovery

Lead Research Organisation: University of Leicester
Department Name: UNLISTED

Abstract

For healthcare and research purposes, diverse patient datasets need to be linked and made easy to find. This requires a unified national approach, designed to protect patient privacy and yet operate in a combined centralised and multi-party (federated) manner. NHS Digital is working with SME Privitar to encrypt and link centralised datasets gathered nationally and regionally (e.g., via LHCREs). To complement this, compatible methods also need to be applied to federated resources – both regional and themed. Resulting sets of linkable data on individual patients then need to be made easy to co-discover in technically and ethically secure ways, as a prelude to access and analysis. We propose a Midlands project to demonstrate such a solution – involving data owners, Privitar and other companies as system providers, along with HDRUK, TDCC, governance and other experts. We will establish a network of linkable and co-discoverable patient-related datasets, anchored on a substantial set of ‘LLR’ GP records (Leicester, Leicestershire, Rutland). The network will also span other regional primary care data, secondary care records, and biobank/cohort data. A case study in multimorbidity will inform and test the system's detailed design, and a multi-disciplinary oversight committee will ensure scalability and national relevance.

Technical Summary

The UK’s health data ecosystem needs to link datasets on individuals, via a universal technology plus operational policies, so that record de-identification, encryption and linking can support data discovery, sharing, integration and analysis across a continuum from federated to centralised models. NHS Digital is working with Privitar to provide such capabilities in primarily centralised arrangements. We will complement their approach by implementing Privitar’s technologies across the Midlands, so as to also leverage the advantages (immediacy, customisation, deep discoverability, local control) of a more federated architecture. We plan four activities:
1) Encryption and linking – upon the substantial ‘LLR’ GP dataset using Privitar's technology operated by NHS Leicestershire Health Informatics Service (LHIS), and also upon exemplars from other primary care, secondary care, biobank and cohort datasets to demonstrate scalability and linkability;
2) Data discovery – by deploying mature software (Café Variome) on top of linked searchable facets from these datasets in a secure federated manner;
3) Utility case study – on multimorbidity to iteratively specify and assess system utility;
4) Scalability planning – via a strategic ‘scalability’ committee (PPI, HDRUK, NHS Digital, governance, regional Trusts, industry, researchers) to monitor progress and define a nationally optimised route for scaling up in depth, breadth and functionality.
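
Activities 1 and 2 can be illustrated with a minimal sketch. It is not Privitar's technology or Café Variome's interface, neither of which is described in this record. It assumes a generic keyed hash (HMAC-SHA256) as the pseudonymisation step and a simple token-matching query as the discovery step. The key, field names, identifiers and function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret; a real deployment would manage keys securely.
LINKAGE_KEY = b"example-key-not-for-real-use"

def pseudonymise(identifier, key=LINKAGE_KEY):
    """Return a keyed hash (HMAC-SHA256) of a normalised patient identifier."""
    normalised = identifier.replace(" ", "").strip()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()

def deidentify(records, id_field="nhs_number"):
    """Replace the direct identifier in each record with a linkage token."""
    out = []
    for record in records:
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["token"] = pseudonymise(record[id_field])
        out.append(cleaned)
    return out

def node_matches(node_records, condition):
    """Run locally at one data holder: tokens of patients with the condition."""
    return {r["token"] for r in node_records if condition in r["conditions"]}

def federated_count(nodes, conditions):
    """Count patients who have every listed condition somewhere across the nodes."""
    per_condition = []
    for condition in conditions:
        tokens = set()
        for node_records in nodes.values():
            tokens |= node_matches(node_records, condition)
        per_condition.append(tokens)
    return len(set.intersection(*per_condition)) if per_condition else 0

# Made-up example data: the same patient appears in two datasets.
gp = deidentify([
    {"nhs_number": "943 476 5919", "conditions": ["diabetes"]},
    {"nhs_number": "9434765870", "conditions": ["asthma"]},
])
hospital = deidentify([
    {"nhs_number": "9434765919", "conditions": ["ckd"]},
])
nodes = {"gp": gp, "hospital": hospital}

# Multimorbidity-style question: how many patients have both conditions?
both = federated_count(nodes, ["diabetes", "ckd"])
```

In this sketch each data holder keeps its records locally and exposes only tokens that match a query. Whether a real federated deployment would return tokens, or only aggregate counts, is a governance decision that this record does not specify.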

 
Description 2022 - FHD Implementation Study
Amount £26,996 (GBP)
Funding ID RM38G0268 
Organisation Earlham Institute 
Sector Academic/University
Country United Kingdom
Start 01/2022 
End 12/2023
 
Description ISCF HDRUK DIH Sprint Exemplar: A scalable federated solution for UK health data encryption, linking and discovery
Amount £269,000 (GBP)
Funding ID MC_PC_18031 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 02/2019 
End 11/2019
 
Description UK-2022-BEACON
Amount £45,231 (GBP)
Funding ID RM38G0269 
Organisation Earlham Institute 
Sector Academic/University
Country United Kingdom
Start 01/2022 
End 12/2023
 
Title Beacon-v2 
Description Co-leading development (under GA4GH) of the Beacon-v2 standard API for expressing and responding to discovery queries regarding biomedical research assets 
Type Of Material Computer model/algorithm 
Year Produced 2022 
Provided To Others? Yes  
Impact Wide adoption is now underway in many national and international settings. It is already transforming discovery capabilities and creating unprecedented interoperability in this domain. 
URL https://docs.genomebeacons.org/what-is-beacon-v2/
 
Title LeHMR 
Description Leicester Health and Medical Data for Research (LeHMR) is software that enables rapid creation of a 'local Gateway' for biomedical dataset listing and discovery, fully compatible with the federation capabilities of the national HDR-UK Gateway. It supports federated querying and metadata syndication. 
Type Of Technology Webtool/Application 
Year Produced 2022 
Impact It has been adopted by the University of Leicester and Kidney Research UK, and is being considered by others. It has been offered to HDR-UK as a general solution to help grow their planned Gateway federation network.