DARE: Creating the blueprint for a federated network of next generation, cross-council Trusted Research Environments.

Lead Research Organisation: University Hospitals Birmingham NHS Foundation Trust
Department Name: UNLISTED

Abstract

Solving society’s complex challenges requires experts working together, studying data collected for different purposes & from different sources & locations. However, combining data is challenging. There are public concerns about data security & access, especially for health data. Data governance (legal & ethical frameworks for data sharing) is critical. There are technical challenges in combining data collected in different “data languages” & in building secure computer networks which enable collaborative work, but protect privacy.

FED-NET builds on our operational system, providing a scalable solution to the technical & governance challenges of analysing datasets separated by geography & data language.

Working with patients, the public, analysts & clinicians, we have co-designed a secure way to combine sensitive health data with other data, working across 5 NHS hospitals. We have co-built a transparent governance process, ensuring data access is legal, with full public oversight.

We will scale our existing Trusted Research Environments (secure environments that ensure data privacy but enable large scale analytics) using “federated analytics” where the data stays put & the analysis moves.

We will test how different data languages can be translated into a common standard, focusing on data highly valued in research (laboratory science, meteorological data) but rarely available, using a study of asthma. We will test our governance solution, through public and expert workshops.

Technical Summary

Tackling societal challenges requires data & partnerships which span traditional funder silos. Data collected for specific purposes have distinct structures & ontologies. There are different common data models; none are comprehensive for cross-council research. Comprehensive datasets increase the risk of reidentification. Workshops with >400 lay members confirmed support for data access for public good, with data exposure limited to “where necessary” & “NHS proximity” as a gold standard.

FED-NET will test:
1. Whether data of differing modalities/languages can be combined using a standardised framework.
2. How open standards map diverse data for cross-council projects.
3. Whether a federated analytics model (including governance) can be deployed.
4. Whether this model serves analytical need & enhances public trust.

This DARE sprint will implement & test an innovative, scalable, industry-aligned Trusted Research Environment (TRE) & governance model which facilitates enhanced federated data discovery, focusing on a test case of asthma, including clinical, meteorological, pollution & translational data.

Councils served by the test case include MRC, EPSRC, InnovateUK and NERC.

Methods
The technical architecture is built & operational (HDR-UK PIONEER data haven/TRE). PIONEER’s tested governance model will be piloted across federated TREs, to determine scalability.

We will automate elements of the HDR-UK Five Safes, providing a metadata interchange, expanding equitable access to high-quality research data assets, reducing health inequalities.

Data solutions will be built around open standards including REST, HTTP, OMOP & FHIR-UK, reducing proprietary/commercial constraints. Both NUH & UHB have experience in this. Research metadata will be queried following W3C international standards for data management & system interoperability.

We will adopt the Resource Description Framework (RDF) to support metadata exchange, using the query language SPARQL to facilitate expressive queries across diverse linked data sources. Scalability will support work ranging from basic statistics to advanced machine learning. To allow contemporaneous metadata to be pulled or pushed, a secure standards-based RESTful API will be specified & implemented, allowing equitable access over the open HTTP protocol.
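As a minimal sketch of how a SPARQL metadata query could travel over HTTP, the Python below builds a query and encodes it as the `query` parameter of a SPARQL 1.1 Protocol GET request. The endpoint URL, the choice of the DCAT and Dublin Core vocabularies, and the function names are assumptions made for illustration; the project does not specify them here, and no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the source does not name one.
ENDPOINT = "https://tre.example.org/sparql"


def build_dataset_query(keyword: str, limit: int = 10) -> str:
    """Return a SPARQL SELECT query for datasets whose title mentions `keyword`."""
    # Escape backslashes and double quotes so the keyword is a safe string literal.
    safe = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dcat: <http://www.w3.org/ns/dcat#>\n"
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?dataset ?title WHERE {\n"
        "  ?dataset a dcat:Dataset ;\n"
        "           dct:title ?title .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?title)), "{safe.lower()}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as the `query` parameter of a SPARQL Protocol GET request."""
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    # Print the URL a client would request; nothing is sent over the network.
    print(build_request_url(ENDPOINT, build_dataset_query("asthma")))
```

Only metadata (dataset identifiers and titles) is requested in this sketch, which is in keeping with the stated aim of exchanging metadata rather than row-level records.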

Data will be extracted to, staged in, & queried from an RDF-compatible meta-database preserving the original granularity, context, semantics, & encoding.
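One way to picture staging that preserves native encoding is to hold each fact as a subject, predicate, object triple, with the original code, coding system and unit kept as separate statements. The sketch below is illustrative only: the identifiers, predicate names and values are hypothetical, and a real deployment would presumably use an RDF store rather than a Python list.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers and values, for illustration only.
STORE: List[Triple] = [
    Triple("obs:1", "ex:nativeCode", "IL6"),
    Triple("obs:1", "ex:nativeCodingSystem", "local-lab"),
    Triple("obs:1", "ex:nativeUnit", "pg/mL"),
    Triple("obs:1", "ex:sourceSystem", "laboratory"),
    Triple("obs:2", "ex:nativeCode", "PM2.5"),
    Triple("obs:2", "ex:nativeUnit", "ug/m3"),
    Triple("obs:2", "ex:sourceSystem", "air-quality"),
]


def match(
    store: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def describe(store: List[Triple], subject: str) -> dict:
    """Collect every predicate/value pair recorded for one subject."""
    return {t.predicate: t.obj for t in match(store, subject=subject)}


if __name__ == "__main__":
    print(describe(STORE, "obs:1"))
```

Because nothing is overwritten during staging, a later translation to another model can still be traced back to the native code and unit.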

On request, the API will translate metadata to other popular research models such as OMOP or FHIR for enhanced onwards transportation & federation. Query results can be aggregated or used for statistical analysis, with results sent back to the client.
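The aggregation step can be sketched as follows: each TRE reduces its row-level values to a count, a sum and a sum of squares, and only those aggregates are returned and pooled. This is an illustrative sketch, not the project's implementation; the names `SiteSummary`, `summarise_locally` and `pool`, and the minimum cell count of 10, are assumptions.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Iterable, Optional, Tuple

MIN_CELL_COUNT = 10  # assumed disclosure-control threshold, not from the source


@dataclass(frozen=True)
class SiteSummary:
    n: int
    total: float
    total_sq: float


def summarise_locally(values: Iterable[float]) -> Optional[SiteSummary]:
    """Run inside a TRE: reduce row-level values to three aggregates.

    Returns None when the cohort is too small to release.
    """
    vals = list(values)
    if len(vals) < MIN_CELL_COUNT:
        return None
    return SiteSummary(len(vals), sum(vals), sum(v * v for v in vals))


def pool(summaries: Iterable[Optional[SiteSummary]]) -> Tuple[int, float, float]:
    """Run at the coordinator: pooled n, mean and sample standard deviation."""
    released = [s for s in summaries if s is not None]
    n = sum(s.n for s in released)
    if n < 2:
        raise ValueError("not enough released data to pool")
    total = sum(s.total for s in released)
    total_sq = sum(s.total_sq for s in released)
    mean = total / n
    # Sample variance from sums; clamp tiny negative values caused by rounding.
    variance = (total_sq - n * mean * mean) / (n - 1)
    return n, mean, sqrt(max(variance, 0.0))


if __name__ == "__main__":
    site_a = summarise_locally(range(1, 11))
    site_b = summarise_locally([5.0] * 12)
    site_c = summarise_locally([1, 2, 3])  # suppressed: fewer than 10 rows
    print(pool([site_a, site_b, site_c]))
```

The pooled mean and standard deviation match what would be obtained from the combined rows of the releasing sites, yet no row leaves any site, and sites below the threshold contribute nothing.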

Data controller, analyst & public involvement events will assess whether stakeholder and user needs are met, with enhanced public trust.

Test case data assets are in hand, but in native language.

Impacts include:
• Blueprints & code templates for federated TRE networks.
• A map of limitations of common data models versus native language for diverse data assets.
• An understanding of more readily extensible data models than the current CDMs in widespread use.
• Production of deeply phenotyped cross-council research assets covering two large acute trusts and BRCs without direct exposure of sensitive data to researchers or transferring data between data controllers.
• The expansion of a publicly co-produced information governance framework.

Phase 2 will test the wider scalability & commercial offer of this model.

Publications

Atkin C (2022) The impact of changes in coding on mortality reports using the example of sepsis. in BMC medical informatics and decision making

 
Title Animation looking at the use of metagenomic diagnostic pathways in infections and how this can rationalise antibiotic use 
Description An animation co-created with members of the public 
Type Of Art Film/Video/Animation 
Year Produced 2024 
Impact Being used in public health campaigns locally and to explain concepts 
 
Title Your health data could save lives 
Description An animation, co-written with members of the public, to highlight how health data can be used for research and what people's choices are 
Type Of Art Film/Video/Animation 
Year Produced 2022 
Impact Very good feedback and wide usage 
 
Description MHRA consultation about safe medicines use
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
Impact Move away from Valproate use in emergency medicine
 
Description Met with patient advisory group to discuss use of health data by industry
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
Impact Build commercial model which is being tested nationally
 
Description NICE technology appraisal for remdesivir in COVID-19
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
Impact Data used to discuss role for this treatment in COVID with guidelines now reflecting this expert testimonial
 
Description Workshop with 50 members of the stakeholder
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact Helped build protocol for NHSE SDE programme
 
Description Workshop with Members of the Public and Patients to
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
Impact Survey conducted before and after event showed a change in attitudes and enhanced knowledge
 
Description Biomedical Research Centre. Infections in Acute Care
Amount £1,600,000 (GBP)
Organisation National Institute for Health Research 
Sector Public
Country United Kingdom
Start 12/2022 
End 11/2027
 
Description Medicines in Acute Care Driver programme
Amount £5,000,000 (GBP)
Organisation Health Data Research UK 
Sector Private
Country United Kingdom
Start 03/2023 
End 03/2028
 
Description Patient Safety Research Centre Digital Clinical Support Tools in Acute Care
Amount £3,600,000 (GBP)
Organisation National Institute for Health Research 
Sector Public
Country United Kingdom
Start 03/2023 
End 03/2028
 
Description Winter Pressures
Amount £75,000 (GBP)
Organisation Health Data Research UK 
Sector Private
Country United Kingdom
Start 01/2023 
End 03/2023
 
Title A blueprint for a TRE which meets international security standards 
Description This enables a safe and secure TRE to be spun up in a matter of hours 
Type Of Material Improvements to research infrastructure 
Year Produced 2023 
Provided To Others? Yes  
Impact We are just publishing this 
 
Title The West Midlands NHSE Phase 1 SDE based on PIONEER build 
Description PIONEER has formed the blueprint for the NHSE West Midlands Secure Data Environment, and the PIONEER protocol and learnings from federation have formed the protocol blueprint and commercial model. 
Type Of Material Improvements to research infrastructure 
Year Produced 2024 
Provided To Others? Yes  
Impact Standardised and secure data platform which meets ISO standards and a protocol which is freely available on request 
 
Title An NIHR Birmingham Biomedical Research Centre dataset of 21,581 intensive care admissions including demographic data, severity scores (APACHE, SAPS, SOFA) with investigations, serial physiology, treatments, and outcomes up to one year post admission. 
Description A highly granular dataset of 21,581 critical care admissions, curated by the NIHR Birmingham Biomedical Research Centre Infection and Acute Care Theme in collaboration with PIONEER. The data includes initial presentation, presenting symptoms, and several pre-calculated severity scoring systems including Simple Acute Physiology Score (SAPS), the Acute Physiology and Chronic Health Evaluation (APACHE) and the Sequential Organ Failure Assessment (SOFA) score. Data includes demography, serial physiology, ventilatory parameters, investigations, treatments (drug, dose, route), diagnostic codes (ICD-10 & SNOMED-CT) and outcomes, following patients for one year. This can be supplemented with imaging (results and images) and linked to ambulance conveyance and longer-term outcomes in the community. The current dataset includes admissions from 2017 to 2023 but can be expanded to assess other timelines of interest. 
Type Of Material Database/Collection of data 
Year Produced 2024 
Provided To Others? Yes  
Impact Data access is available via the PIONEER Hub for projects which will benefit the public or patients. This can be by developing a new understanding of disease, by providing insights into how to improve care, or by developing new models, tools, treatments, or care processes. Data access can be provided to NHS, academic, commercial, policy and third sector organisations. Applications from SMEs are welcome. There is a single data access process, with public oversight provided by our public review committee, the Data Trust Committee. Contact pioneer@uhb.nhs.uk or visit www.pioneerdatahub.co.uk for more details. Available supplementary data: Matched controls; ambulance and community data. Unstructured data (images). We can provide the dataset in OMOP and other common data models and can build synthetic data to meet bespoke requirements. Available supplementary support: Analytics, model build, validation & refinement; A.I. support. Data partner support for ETL (extract, transform & load) processes. Bespoke and "off the shelf" Trusted Research Environment (TRE) build and run. Consultancy with clinical, patient & end-user and purchaser access/ support. Support for regulatory requirements. Cohort discovery. Data-driven trials and "fast screen" services to assess population size. 
URL https://web.www.healthdatagateway.org/dataset/ea03d4e1-73e8-4d84-b93a-a41febf73fb4
 
Title Hospitalised patients with diabetic emergencies & acute diabetic health concerns 
Description A dataset of 168,706 diabetic emergencies and acute admissions associated with diabetes-related health concerns, including demographic data with investigations, serial physiology and outcomes. 
Type Of Material Database/Collection of data 
Year Produced 2024 
Provided To Others? Yes  
Impact All patients admitted to hospital from year 2002 and onwards, curated to focus on Diabetes. Longitudinal & individually linked, so that the preceding & subsequent health journey can be mapped & healthcare utilisation prior to & after admission understood. The dataset includes highly granular patient demographics & co-morbidities taken from ICD-10 & SNOMED-CT codes. Serial, structured data pertaining to acute care process (timings, staff grades, specialty review, wards and triage). Along with presenting complaints, outpatients admissions, microbiology results, referrals, procedures, therapies, all physiology readings (pulse, blood pressure, respiratory rate, oxygen saturations and others), all blood results(urea, albumin, platelets, white blood cells and others). Includes all prescribed & administered treatments and all outcomes. Linked images are also available (radiographs, CT scans, MRI). Available supplementary data: Matched controls; ambulance, OMOP data, synthetic data. Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform & load) process, Clinical expertise, Patient & end-user access, Purchaser access, Regulatory requirements, Data-driven trials, "fast screen" services. 
URL https://web.www.healthdatagateway.org/dataset/0d556d7e-be27-4979-a09e-d419b2e838f3
 
Title Synthetic data replicating 20,000 ethnically diverse hypertrophic cardiomyopathy patients: This includes clinical and biological phenotyping, co-morbidities, investigations (including ECG, ECHO), any procedures undertaken and outcomes. 
Description A PIONEER synthetic dataset of 20,000 ethnically diverse hypertrophic cardiomyopathy patients created using CT-GAN generative AI. Data includes clinical & biological phenotyping, co-morbidities, investigations (ECG, ECHO), procedures & outcomes. Well-created synthetic data establishes a governance risk-free environment for algorithm development & experimentation. This includes evaluating new treatment models, care management systems, clinical decision support, and more. Synthetic data is of particular use in rare diseases, where real data may be in short supply, or to replicate disease in less common patient demographics (e.g. ethnicities). Familial hypertrophic cardiomyopathy (HCM) is a rare genetic condition characterised by thickening (hypertrophy) of the cardiac muscle, usually of the interventricular septum. Arrhythmias can be life threatening and HCM is associated with an increased risk of sudden death. Some affected individuals develop potentially fatal heart failure, which may require heart transplantation. Approximately 130,000 people have HCM in the UK, but there is a significant burden of undiagnosed disease and diagnostic delay. 
Type Of Material Database/Collection of data 
Year Produced 2024 
Provided To Others? Yes  
Impact Data access is available via the PIONEER Hub for projects which will benefit the public or patients. This can be by developing a new understanding of disease, by providing insights into how to improve care, or by developing new models, tools, treatments, or care processes. Data access can be provided to NHS, academic, commercial, policy and third sector organisations. Applications from SMEs are welcome. There is a single data access process, with public oversight provided by our public review committee, the Data Trust Committee. Contact pioneer@uhb.nhs.uk or visit www.pioneerdatahub.co.uk for more details. Available supplementary data: Matched controls; ambulance and community data. Unstructured data (images). We can provide the dataset in OMOP and other common data models and can provide real world data to meet bespoke requirements. Available supplementary support: Analytics, model build, validation & refinement; A.I. support. Data partner support for ETL (extract, transform & load) processes. Bespoke and "off the shelf" Trusted Research Environment (TRE) build and run. Consultancy with clinical, patient & end-user and purchaser access/ support. Support for regulatory requirements. Cohort discovery. Data-driven trials and "fast screen" services to assess population size. 
URL https://www.pioneerdatahub.co.uk/wp-content/uploads/Patients-at-Risk-of-Sudden-Death-Hypertrophic-Ca...
 
Title Synthetic dataset of cross council data for asthma exacerbations, cytokines, air pollution and weather 
Description A synthetic dataset including data fields replicating an EHR, geographical location, air quality, IL-6 levels and ambient temperature for > 20,000 records 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
Impact Enabled delivery of FED-NET 
 
Description DARE Sprints 1b - DARE-FX: delivering a federated network of TREs to enable safe analytics 
Organisation University of Manchester
Country United Kingdom 
Sector Academic/University 
PI Contribution This is a new collaboration which has arisen due to the DARE sprint 1a work, seeking to expand on our work within the initial DARE sprint
Collaborator Contribution We are contributing technical expertise and synthetic data
Impact The project started 2 months ago, so it is too early for outputs as yet
Start Year 2023
 
Description DARE Sprints 1b - DARE-FX: delivering a federated network of TREs to enable safe analytics 
Organisation University of Nottingham
Country United Kingdom 
Sector Academic/University 
PI Contribution This is a new collaboration which has arisen due to the DARE sprint 1a work, seeking to expand on our work within the initial DARE sprint
Collaborator Contribution We are contributing technical expertise and synthetic data
Impact The project started 2 months ago, so it is too early for outputs as yet
Start Year 2023
 
Description Data and Enabling Technologies Group 
Organisation University of Leicester
Country United Kingdom 
Sector Academic/University 
PI Contribution The group are leading an initiative to construct a national medicines data map. Reflecting both national and international populations, this data map is set to become an invaluable asset for informing future medicines-related research. This expert working group is ongoing from 2023-2028
Collaborator Contribution The group is exploring the adoption of innovative technology developed by Leicester: the 'LeHMR' online platform, which allows researchers to submit metadata about their datasets. Partners involved: University Hospitals Birmingham, Leicester University and University of Leeds. The group is expected to expand in 2024.
Impact Working ongoing
Start Year 2023
 
Description NIHR Patient Safety Research Collaboration Theme - Clinical Decision Support Tools 
Organisation University of Warwick
Country United Kingdom 
Sector Academic/University 
PI Contribution We will lead on building and testing of clinical decision support tools for use in acute and emergency medicine
Collaborator Contribution They will help provide input into user acceptability
Impact Starting April 2023 - so no impacts as yet
Start Year 2023
 
Description Winter Pressures NHSE Funding - Improving patient selection to same day emergency care 
Organisation University Hospitals Birmingham NHS Foundation Trust
Department Acute Medicine
Country United Kingdom 
Sector Hospitals 
PI Contribution We are running this project, funded by NHSE, to see if we can build better selection tools for SDEC care pathways - to reduce avoidable admissions to hospitals via acute medical units.
Collaborator Contribution N/A
Impact We have developed a patient facing leaflet about SDEC, held community workshops about our tool, and have developed two potential tools for further assessment.
Start Year 2023
 
Title Blueprint for NHSE West Midlands Secure Data Environment 
Description A blueprint for an SDE which can be used by NHS organisations 
Type Support Tool - For Fundamental Research
Current Stage Of Development Initial development
Year Development Stage Completed 2023
Development Status Under active development/distribution
Impact Effective and efficient model which has been adopted widely 
 
Title TRE for federated analytics now being used widely 
Description A TRE which is currently used across a small number of Data Controllers 
Type Health and Social Care Services
Current Stage Of Development Refinement. Non-clinical
Year Development Stage Completed 2022
Development Status Under active development/distribution
Impact A cost effective, secure and deployable TRE 
 
Title Blueprint for NHSE West Midlands SDE 
Description This is a blueprint for a cybersecurity tested SDE including data ingress and egress, data warehousing - tested and meeting ISO standards 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2024 
Impact Being used across West Midlands 
 
Title TRE for PIONEER for federated analytics 
Description This is a blueprint for a TRE 
Type Of Technology Software 
Year Produced 2022 
Impact Adopted across a number of Data controllers 
 
Description HDR UK Driver Programmes Priorities Meeting 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Study participants or study members
Results and Impact HDR UK convened this meeting to discuss workplans across the national driver programmes. Liz Sapey presented to the group on the Medicines in Acute and Chronic Care Driver Programme ambitions and workplan. This facilitated discussion around opportunities for integration across programmes and informed the group. There was also a deep dive into data and infrastructure priorities, discussion around access/ integration and support from HDR UK Pillars - e.g Trust and Transparency Capacity building plans.
Year(s) Of Engagement Activity 2023
 
Description Medicines in Acute and Chronic Care Driver Programme, Drug-Drug Interactions Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Study participants or study members
Results and Impact Workshop purpose: to facilitate dialogue on developing a Medicines in Acute and Chronic Care Programme policy on the standardisation of drug-drug interactions. This standardisation will serve as a unified way of working across the Programme and will also be extended to other HDR UK driver programmes. The workshop also provided the opportunity to evaluate existing drug-drug interaction resources and explore the feasibility of developing a dedicated resource for multi-way interactions or a gene interaction resource. Further discussion took place on the possibilities for owning and maintaining this type of resource and on identifying potential funding sources to support it. Munir Pirmohamed and Tjeerd Van Staa presented talks at this workshop on the above topics. The workshop took place on 27/09/2023
Year(s) Of Engagement Activity 2023
 
Description Medicines in Acute and Chronic Care Programme Meetings (Primary Care/Secondary Care, All Programme) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Study participants or study members
Results and Impact The Medicines in Acute and Chronic Care Programme hosts monthly and quarterly programme meetings to discuss the primary care and secondary care medicines innovation workstream, as well as all other workstreams within the programme. These meetings bring together the programme partners across 10 research organisations. They provide the opportunity to report on updates and progress, and encourage collaborative dialogue across the programme. Munir Pirmohamed and Liz Sapey primarily chair and present at these meetings, and the future direction of the programme is coordinated through them.
Year(s) Of Engagement Activity 2023,2024
 
Description Stakeholder workshop of CIOs and CMOs from data controllers, and data scientists 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact We held a series of workshops to discuss the implications of the Goldacre Review and how data egress could be prevented through the use of federated analytics and learning through TREs
Year(s) Of Engagement Activity 2022
 
Description Workshop with members of the public about their views on data egress versus federated approaches to consented health data 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Patients, carers and/or patient groups
Results and Impact A workshop and follow-on series of working groups to agree on knowledge sharing and to produce a leaflet for members of the public describing what federated analysis is and what its benefits and limitations are
Year(s) Of Engagement Activity 2022