ISCF HDRUK DIH Sprint Exemplar: A scalable federated solution for UK health data encryption, linking and discovery
Lead Research Organisation:
University of Leicester
Department Name: UNLISTED
Abstract
For healthcare and research purposes, diverse patient datasets need to be linked and made easy to find. This requires a unified national approach, designed to protect patient privacy and yet operate in a combined centralised and multi-party (federated) manner. NHS Digital is working with SME Privitar to encrypt and link centralised datasets gathered nationally and regionally (e.g., via LHCREs). To complement this, compatible methods also need to be applied to federated resources – both regional and themed. The resulting sets of linkable data on individual patients then need to be made easy to co-discover in technically and ethically secure ways, as a prelude to access and analysis. We propose a Midlands project to demonstrate such a solution – involving data owners, Privitar and other companies as system providers, along with HDRUK, TDCC, governance and other experts. We will establish a network of linkable and co-discoverable patient-related datasets, anchored on a substantial set of ‘LLR’ GP records (Leicester, Leicestershire, Rutland). The network will also span other regional primary care data, secondary care records, and biobank/cohort data. A case study in multimorbidity will inform and test the system's detailed design, and a multi-disciplinary oversight committee will ensure scalability and national relevance.
Technical Summary
The UK’s health data ecosystem needs to link datasets on individuals, via a universal technology plus operational policies, so that record de-identification, encryption and linking can support data discovery, sharing, integration and analysis across a continuum of federated-to-centralised models. NHS Digital is working with Privitar to provide such capabilities in primarily centralised arrangements. We will complement their approach by implementing Privitar’s technologies across the Midlands to also leverage the advantages (immediacy, customisation, deep discoverability, local control) of a more federated architecture. We plan four activities:
1) Encryption and linking – upon the substantial ‘LLR’ GP dataset using Privitar's technology operated by NHS Leicestershire Health Informatics Service (LHIS), and also upon exemplars from other primary care, secondary care, biobank and cohort datasets to demonstrate scalability and linkability;
2) Data discovery – by deploying mature software (Café Variome) on top of linked searchable facets from these datasets in a secure federated manner;
3) Utility case study – on multimorbidity to iteratively specify and assess system utility;
4) Scalability planning – via a strategic ‘scalability’ committee (PPI, HDRUK, NHS Digital, governance, regional Trusts, industry, researchers) to monitor progress and define a nationally optimised route for scaling up in depth, breadth and functionality.
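The first two activities can be illustrated with a minimal sketch, offered purely as an assumption-laden example and not as a description of Privitar's technology or of Café Variome. It assumes that each data holder replaces patient identifiers with a keyed hash (HMAC-SHA256) under a shared linkage key, so that the same individual yields the same token at every site, and that a discovery query reports only a count of linked individuals. The names `LINKAGE_KEY`, `pseudonymise`, `Node` and `count_linked`, and all identifiers and conditions in the demo data, are hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared linkage key (assumption): a real deployment would keep
# this with a trusted party and never hard-code it.
LINKAGE_KEY = b"demo-key-not-for-real-use"


def pseudonymise(patient_id: str, key: bytes = LINKAGE_KEY) -> str:
    """Return a keyed-hash token for a patient identifier."""
    # Normalise formatting so the same identifier always yields the same token.
    normalised = patient_id.replace(" ", "").upper()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


class Node:
    """A hypothetical data holder that stores tokens instead of raw identifiers."""

    def __init__(self, name, records):
        self.name = name
        # Map each token to the set of searchable facets (here: conditions).
        self._facets = {
            pseudonymise(pid): set(conditions)
            for pid, conditions in records.items()
        }

    def tokens_matching(self, condition):
        """Return the tokens of individuals recorded with the given condition."""
        return {t for t, conds in self._facets.items() if condition in conds}


def count_linked(node_a, condition_a, node_b, condition_b) -> int:
    """Count individuals with condition_a at node_a and condition_b at node_b."""
    linked = node_a.tokens_matching(condition_a) & node_b.tokens_matching(condition_b)
    # Only the aggregate count is reported, never the tokens themselves.
    return len(linked)


# Made-up demo data: two sites holding records on overlapping individuals.
gp = Node("primary care", {
    "id 001": ["diabetes"],
    "ID002": ["asthma"],
    "ID003": ["diabetes", "hypertension"],
})
hospital = Node("secondary care", {
    "ID001": ["ckd"],
    "ID003": ["ckd"],
    "ID004": ["ckd"],
})

if __name__ == "__main__":
    print(count_linked(gp, "diabetes", hospital, "ckd"))
```

In this toy setting the multimorbidity question "how many people have diabetes in primary care and chronic kidney disease in secondary care?" is answered without either site exposing raw identifiers. A production system would need further safeguards that this sketch omits, such as key management, small-number suppression and governance checks.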
Publications
Denommé-Pichon AS (2023). A Solve-RD ClinVar-based reanalysis of 1522 index cases from ERN-ITHACA reveals common pitfalls and misinterpretations in exome sequencing. Genetics in medicine : official journal of the American College of Medical Genetics.
Fagbamigbe A (2022). Clustering Long-Term Health Conditions Among 67728 People with Multimorbidity Using Electronic Health Records in Scotland. SSRN Electronic Journal.
Rambla J (2022). Beacon v2 and Beacon networks: A "lingua franca" for federated data discovery in biomedical genomics, and beyond. Human mutation.
Yaldiz B (2023). Twist exome capture allows for lower average sequence coverage in clinical exome sequencing. Human genomics.
Description | 2022 - FHD Implementation Study |
Amount | £26,996 (GBP) |
Funding ID | RM38G0268 |
Organisation | Earlham Institute |
Sector | Academic/University |
Country | United Kingdom |
Start | 01/2022 |
End | 12/2023 |
Description | ISCF HDRUK DIH Sprint Exemplar: A scalable federated solution for UK health data encryption, linking and discovery |
Amount | £269,000 (GBP) |
Funding ID | MC_PC_18031 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 02/2019 |
End | 11/2020 |
Description | UK-2022-BEACON |
Amount | £45,231 (GBP) |
Funding ID | RM38G0269 |
Organisation | Earlham Institute |
Sector | Academic/University |
Country | United Kingdom |
Start | 01/2022 |
End | 12/2023 |
Title | Beacon-v2 |
Description | Co-leading development (under GA4GH) of the Beacon-v2 standard API for expressing and responding to discovery queries regarding biomedical research assets |
Type Of Material | Computer model/algorithm |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | Wide adoption now underway in myriad national and international settings. Already transforming discovery capabilities, and creating unprecedented interoperability in this domain. |
URL | https://docs.genomebeacons.org/what-is-beacon-v2/ |
Title | LeHMR |
Description | Leicester Health and Medical Data for Research (LeHMR) is software that enables rapid creation of a 'local Gateway' for biomedical dataset listing and discovery, fully compatible with the federation capabilities of the national HDR-UK Gateway. It supports federated querying and metadata syndication. |
Type Of Technology | Webtool/Application |
Year Produced | 2022 |
Impact | It has been adopted by the University of Leicester and Kidney Research UK, and is being considered by others. It has been offered to HDR-UK as a general solution to help grow their planned Gateway federation network. |