OpenSAFELY, ISARIC, PHOSP: tracking consequences of COVID-19 infection across UK primary and secondary care.

Lead Research Organisation: University of Oxford
Department Name: Primary Care Health Sciences

Abstract

While a large number of people who were diagnosed with COVID-19 have sadly died, there are still many people who survived having severe COVID. However, we know relatively little about people's experiences after this, in terms of how well they have recovered, and if they have additional healthcare needs when compared to similar people who did not have COVID-19. There are early reports of people experiencing "long COVID" - where the symptoms of COVID persist for many weeks, and also an increased likelihood of more serious outcomes like stroke and heart attacks.

A better understanding of the experience of people recovering from COVID-19 will help patients and doctors to understand the risks that they face, and to make informed choices about treatment during recovery. It will also help people who make decisions about providing healthcare services to determine what kind of services are needed, how many people they might have to treat, and for how long after COVID-19 those services might be required. Ultimately this could help to improve the outcomes of patients, helping them to recover to a greater extent, and/or more quickly.

The OpenSAFELY project was set up at the beginning of the pandemic to provide urgent information on COVID-19. It contains information from the primary care (general practice) records of 40% of the English population. ISARIC contains detailed hospital records for people who were admitted to hospital with COVID-19. PHOSP contains data on symptoms and laboratory tests for people who were hospitalised with COVID-19 but were later discharged. These three data sources will be linked together to provide a much more powerful resource, where we can reliably determine what happens to COVID-19 patients after they are discharged from hospital.

Specifically, we will measure the occurrence in COVID-19 patients of many different diagnoses, symptoms and other healthcare activities like treatments and lab tests. We will compare these to how often they occur in similar patients who did not have COVID-19, and will determine what sort of things might influence how often they occur. For example, people who are older or those with previous medical conditions might be more likely to experience adverse outcomes during their recovery. We will describe in detail which patient outcomes are most likely to occur, which patients are most likely to get them and how long the risk lasts for. We will then estimate the impact of these risks on healthcare services to help plan services in future.

Technical Summary

The prevalence and severity of health consequences for patients who have had COVID-19 are not currently known. There is also little data on which patients are most at risk of ongoing health and care needs. We must understand post-COVID prognosis and risks to inform choices around prevention and treatment, design services, predict need, inform patients about their risks and prognosis, mitigate individuals' risks, and improve clinical outcomes.

We will link data and combine expertise from three key projects to provide an unprecedented, comprehensive, longitudinal, patient-level view on COVID-19 patients in England:

OpenSAFELY, running across patients' full primary and secondary care electronic health records (40% of patients in England, rising to 95% during the course of the project).

ISARIC, with detailed data on >80,000 COVID-19 patients' in-hospital presentation and management.

PHOSP, collecting bespoke symptom and laboratory data over 12 months on 10,000 hospitalised COVID-19 survivors.
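
As a rough illustration of what linking the three sources could look like, the sketch below joins hospital and follow-up records onto primary care records through a shared pseudonymised patient identifier. The identifier, field names and records are all hypothetical and invented for illustration; this is not the project's actual linkage method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LinkedPatient:
    """One patient's records after linkage (all field names are hypothetical)."""
    patient_id: str
    gp_record: dict
    hospital_record: Optional[dict]   # ISARIC-style in-hospital data, if any
    followup_record: Optional[dict]   # PHOSP-style follow-up data, if any


def link_sources(gp: dict, hospital: dict, followup: dict) -> list:
    """Left-join hospital and follow-up records onto GP records by patient id.

    Patients with no hospital or follow-up record are kept, with None in
    the corresponding field, so non-hospitalised patients stay in the cohort.
    """
    return [
        LinkedPatient(pid, record, hospital.get(pid), followup.get(pid))
        for pid, record in sorted(gp.items())
    ]


# Invented example records keyed by an assumed pseudonymised identifier.
gp = {"p1": {"age": 71}, "p2": {"age": 45}, "p3": {"age": 60}}
hospital = {"p1": {"icu": True}, "p3": {"icu": False}}
followup = {"p1": {"breathless_at_6_months": True}}

linked = link_sources(gp, hospital, followup)
```

Keeping unmatched patients in the output (a left join) is a deliberate choice here, since the aims that follow cover both hospitalised and non-hospitalised patients.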

Using OpenSAFELY EHR data linked to ISARIC and PHOSP cohort data we will:

- Assess risk of specific diagnoses, presentations, treatments, investigation findings, and symptoms that are elevated following COVID-19, in hospitalised and non-hospitalised patients.

- Evaluate the impact of age, ethnicity, prior medical history, COVID-19 disease severity and in-hospital treatment on variation in recovery and complications.

- Evaluate the extent to which PHOSP findings generalise to non-admitted COVID-19 patients.

- Describe the impact of "long COVID" on health service utilisation to help predict service design and need.

- Estimate excess morbidity associated with COVID-19.

- Develop tools to inform shielding policy based on long-term outcomes.

- Share open code resources.
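
To make the comparison behind the risk and excess morbidity aims concrete, here is a minimal sketch of a crude rate ratio and rate difference between a COVID-19 cohort and a comparison cohort. The function names and the numbers are invented for illustration, and a real analysis would also adjust for factors such as age and prior medical history.

```python
def incidence_rate(events: int, person_years: float) -> float:
    """Crude incidence rate: events per person-year of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events / person_years


def compare_cohorts(events_covid: int, py_covid: float,
                    events_control: int, py_control: float) -> dict:
    """Compare an outcome between a COVID-19 cohort and a comparison cohort.

    Returns the crude rate ratio and the excess rate per 1,000 person-years.
    """
    rate_covid = incidence_rate(events_covid, py_covid)
    rate_control = incidence_rate(events_control, py_control)
    if rate_control == 0:
        raise ValueError("rate ratio is undefined when the control rate is zero")
    return {
        "rate_ratio": rate_covid / rate_control,
        "excess_per_1000_py": (rate_covid - rate_control) * 1000,
    }


# Invented numbers: 30 events in 1,000 person-years after COVID-19,
# against 10 events in 1,000 person-years in similar patients without it.
result = compare_cohorts(30, 1000.0, 10, 1000.0)
print(round(result["rate_ratio"], 2), round(result["excess_per_1000_py"], 1))
```

With these made-up inputs the outcome is about three times as frequent after COVID-19, an excess of roughly 20 events per 1,000 person-years.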

Publications


 
Description Characterisation, determinants, mechanisms and consequences of the long-term effects of COVID-19: providing the evidence base for health care
Amount £9,592,626 (GBP)
Funding ID MC_PC_20051 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 03/2021 
End 02/2024
 
Description Phase 1 COVID-19 Data and Connectivity - National Core Study (Phase 1 D&C-NCS)
Amount £1,760,000 (GBP)
Funding ID MC_PC_20058 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 04/2021 
End 09/2022
 
Description Phase 1 COVID-19 Longitudinal Health and Wellbeing - National Core Study
Amount £9,000,000 (GBP)
Funding ID MC_PC_20059 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 04/2021 
End 09/2022
 
Title The OpenSAFELY platform 
Description OpenSAFELY is a productive, highly secure, transparent, non-commercial, fully open-source software platform specifically developed for analysis of NHS electronic health records data. It can be deployed in any data centre to create a TRE with unprecedented security, transparency, and efficiency. With these privacy protections the OpenSAFELY team has earned public trust to implement OpenSAFELY across an unprecedented volume of NHS patient data.

OpenSAFELY-TPP and OpenSAFELY-EMIS: 58 million full GP records
OpenSAFELY has been fully implemented inside the data centres of TPP and EMIS, the two largest GP EHR providers in England, under COPI. This delivers analyses across the full raw primary care records of 58 million patients. Data includes all diagnoses, blood tests, prescriptions, investigations, and referrals. In total it is over 60 billion rows of detailed NHS data. This GP data has never been accessible before at national scale, and is not available from any other source (GPES in NHSD and ONS is a smaller derivative GP dataset). Other routes to access GP data have struggled with privacy and delivery problems. OpenSAFELY has active positive support from the professions and Citizens Juries. OpenSAFELY also has active positive support from privacy campaigners who typically block access to GP data at this scale. This is only possible because of the new approaches to privacy preservation and transparency in the OpenSAFELY tools.

Additional datasets are easily and regularly ingested and linked in OpenSAFELY
A range of crucial datasets for NHS service monitoring and academic research have already been linked, including:
- SUS (hospital admissions, outpatient visits)
- ECDS (coded A&E attendances)
- CPNS (death in hospital from Covid)
- SGSS (Covid test results)
- Covid vaccination data
- ONS (cause of death in and out of hospital)
- Household data (other pseudonymised occupants, whether it is a care home, approximate location)
- ICNARC (ICU data)
- ISARIC (detailed hospital records of hospitalised Covid patients)
- ONS Covid Infection Survey
- And more.

How is OpenSAFELY funded?
OpenSAFELY is a fully open and non-commercial project for public good, funded solely by public money through research grants from UKRI, Wellcome Trust, NIHR, and others. All code is freely available and open to all for review and re-use in any setting.

Who runs OpenSAFELY?
The core OpenSAFELY technical team are the DataLab at Oxford, consisting of high-end software developers who have been trained with deep knowledge of NHS EHR data, working alongside NHS data analysts and researchers with deep training in software development. They work in close collaboration with:
- Users (including researchers from LSHTM and 18 other external organisations, including multiple NHS users).
- The developer and data teams at TPP and EMIS, with deep EHR data expertise.
- NHS England, which is the Data Controller for OpenSAFELY.

What is OpenSAFELY technically?
OpenSAFELY is not a conventional TRE where users have unfettered access to data on a remote desktop, as that approach would not currently be secure enough to earn public and professional trust for national scale of GP data access.
It is a complete set of bespoke tools for NHS data curation, privacy preservation, and efficient pipelining of analyses across vast NHS datasets, with:
- Modular open source code, reducing duplication of effort on data curation, analysis, workflow, dashboards, federated analytics, and more.
- Bespoke NHS data curation tools where users curate the data once, then can easily share standard curation code to all for review, re-use, and modification.
- Innovative privacy methods (analysts write code for data preparation, analysis and visualisation using bespoke dummy data created for them by the OpenSAFELY curation modules; this is then deployed against the real data, but users never need unfettered direct access to disclosive patient records).
- A complete public open log of all code executed on the platform, earning public and professional trust through complete transparency on all actions (though outputs can be kept closed for an interim duration (typically set by a governance group) to allow researchers to develop their work before finally being made public).
- A set of live NHS dashboard tools (with optional login controls for local NHS data, including new NHS smart card integration).
Teams can make and share their own bespoke and re-usable code modules for their own analytic needs. Because code is portable between settings, analyses can be run in multiple settings easily, delivering "federated analytics" between settings. This "federated analytics" is done regularly between OpenSAFELY-TPP and OpenSAFELY-EMIS; and has been done with ICS data also. All OpenSAFELY tools are portable and can be implemented wherever NHS data resides. When OpenSAFELY is built in a new environment, all the OpenSAFELY code for data cleaning, curation, efficient analysis and dashboards can simply be lifted and shifted to execute in those new environments.
OpenSAFELY is now deployed in one ICS, and the data centre containing data for another 12 ICSs (where access can be turned on with an IG signature); NHSD (fully performant against dummy data; live data after an IG signature); and in various scales of deployment in other data centres nationally and internationally. 
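
The dummy-data workflow described above (analysts develop code against dummy data, and the same code is later deployed against the real data) can be sketched as follows. This is a minimal illustration under stated assumptions: `make_dummy_patients` and `count_covid_by_age_band` are hypothetical names, the record fields are invented, and none of this is the OpenSAFELY API.

```python
import random


def make_dummy_patients(n: int, seed: int = 0) -> list:
    """Generate synthetic patient rows with the same shape as the real data.

    The fields ("age", "had_covid") are hypothetical, and the values are
    random, so nothing here can disclose a real patient.
    """
    rng = random.Random(seed)
    return [
        {"age": rng.randint(18, 95), "had_covid": rng.random() < 0.2}
        for _ in range(n)
    ]


def count_covid_by_age_band(patients: list) -> dict:
    """Analysis code written once by the analyst.

    It takes row-level records but returns only aggregate counts, so the
    analyst never needs to look at individual rows.
    """
    counts = {}
    for patient in patients:
        if patient["had_covid"]:
            band = "65+" if patient["age"] >= 65 else "18-64"
            counts[band] = counts.get(band, 0) + 1
    return counts


# Development step: the analyst runs the code locally against dummy data.
dummy_result = count_covid_by_age_band(make_dummy_patients(1000))

# Deployment step (not shown): the platform would call the same
# count_covid_by_age_band function on real records inside the secure
# environment and release only the aggregate output.
```

The point of the design is that the analysis function is identical in both steps; only the source of the rows changes.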
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? Yes  
Impact What OpenSAFELY has achieved

NHS service monitoring and improvement
OpenSAFELY can be used to monitor clinical activity and outcomes in all NHS data in all practices and regions in near-real-time. We are happy to give demonstrations of all these outputs. Examples of completed work include:
- PINCER: this is a national patient safety programme previously delivered through AHSNs by running manual searches in hundreds of practices to identify patients whose treatment may have breached various safety recommendations, some of which are complex clinical pathways; we have re-implemented all of PINCER in OpenSAFELY to deliver all PINCER metrics for all practices in a single command, and provide dashboards across the whole nation.
- NHS Service Restoration Observatory: we have deployed code that monitors volume of activity and clinical outcomes for any aspect of GP service during COVID to identify outliers, with outputs in reports, papers, and updating dashboards.
- Vaccine coverage dashboards: we deployed vaccine coverage dashboards immediately in Dec 2020, with detailed coverage data in fine grained clinical (as well as demographic) subgroups. OpenSAFELY was the first to raise the need to address lower vaccine uptake in: ethnic minorities; learning disabilities; severe mental illness; care home residents; and more. Outputs in papers and live updating dashboards.
This work is now being enhanced through our NHS Primary Care and Medicines Analytics Unit joint with the NHS England primary care and medicines teams.

Research Outputs
OpenSAFELY delivered its first output just 6 weeks after project commencement. Since then it has delivered 24 published papers and 26 preprints in high impact journals including Nature, the Lancet, the BMJ and more alongside numerous reports to SAGE, JCVI, CMO, and CSA.
Specific papers are on various topics (COVID only, due to COPI notice permissions) including:
- COVID-19 risk factors
- Vaccine effectiveness, safety, and coverage
- Long COVID prevalence and risk factors
- RCT follow-up in EHR data at very low cost
- COVID risks in ethnic groups, learning disabilities, HIV, others
- Health service utilisation post-COVID
- Health consequences of COVID infection and admission.

An Efficient, High Throughput Platform
OpenSAFELY has delivered a very large number of completed outputs because the OpenSAFELY working methods are more efficient. The OpenSAFELY tools have been built to scale, with full documentation and automated pipelines built around shared re-usable modular code, rather than "manual labour" for each single analysis. All code is shared as open, but also designed to be re-usable by default, so every new analysis benefits from all prior analyses. These working practices are a deliberate choice:
- The curation pipeline is standardised, and all curation code easily re-usable.
- Repeated actions are turned into re-usable modules and features.
- Everything has detailed technical documentation for users.
- Developers are trained in NHS data analysis; analysts are trained in software skills.
- Developers and analysts work hand-in-hand.

Other Users
The OpenSAFELY team is currently supporting 30 projects from 18 organisations during the pilot phase. It is ready and able to scale rapidly, and support efficient use of NHS data by many more researchers in OpenSAFELY-TPP, OpenSAFELY-EMIS, OpenSAFELY-ICS, and OpenSAFELY-NHSD.

Federated analytics
OpenSAFELY has already delivered federated analytics, where the exact same analysis is run automatically in different locations containing NHS data, despite their being wildly different computational and data environments. The platform is built so that OpenSAFELY code for data curation, analysis and dashboards can run in any data centre where OpenSAFELY is implemented.
Federated Analytics is how all national analyses in OpenSAFELY are already done: code is written in OpenSAFELY; it then executes separately in OpenSAFELY-TPP and OpenSAFELY-EMIS; and the results are stitched together after completion.

Portable Analytics
OpenSAFELY is now running inside the Graphnet data centre containing 13 ICSs' data, with IG approval for live data access already in one ICS: we have now completed our first end-to-end analysis running the same code in OpenSAFELY-TPP and OpenSAFELY-ICS to get comparable outputs. We can happily implement OpenSAFELY in any setting where users want portable curation and analysis. All code for dashboards, reports or research created in any OpenSAFELY environment moves smoothly to work in any new OpenSAFELY setting.

Efficient Curation and Cleaning of Difficult NHS EHR and GP data
OpenSAFELY has built the tools and framework for a systematic approach to cleaning and curating all raw NHS data, converting it into usable variables alongside documentation, validity tests, and the code to instantly implement those curated variables in analyses. This efficient, shared, systematic approach - "curate once, share to all" - means fast outputs.

Capacity Building
OpenSAFELY has developed a substantial team perfectly tuned to the job of getting better outputs from NHS data: high end software developers trained in the specific challenges of NHS data, NHS analytics, and academic research; pooling skills and working alongside NHS analysts and researchers trained in software development. This core team is ready to expand and train others.

The OpenSAFELY Co-Pilot Programme
Users of OpenSAFELY currently need basic familiarity with standard data science tools including GitHub, Python, and Docker. Some analysts in academia and the NHS lack fluent data science skills.
To accelerate outputs the team have taken two approaches:
- The OpenSAFELY Co-Pilot programme gives each new user 5 person-days of one-to-one training and support over their first 4 weeks to get their first analysis complete. This has been a huge success and built analytic capacity across England.
- The OpenSAFELY team is also building "point and click" tools so that NHS analysts can easily generate graphs, dashboards and reports on NHS GP activity and outcomes without needing any data science skills.

Public Trust and Data Access
OpenSAFELY has earned unprecedented trust from organisations that traditionally obstruct NHS data access, by listening to their concerns and developing new technical solutions for privacy and transparency that address those concerns. OpenSAFELY was the single most strongly supported NHS data initiative in a major Citizens Jury sponsored by the NDG and NHSX: the Jury's only concern was that OpenSAFELY might be closed after COPI ended!

Art of the Possible
OpenSAFELY has finally proved it is possible to "have your cake and eat it" with NHS data: to have massive access to massive volumes of data but also preserve patient privacy and patient trust. It has proved that the NHS can deliver large scale data analytics in broad collaboration at reasonable cost for both research and service improvement.
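
The federated pattern described in this entry (code written once, executed separately in each backend, results stitched together afterwards) can be sketched in a few lines. The backend names mirror OpenSAFELY-TPP and OpenSAFELY-EMIS, but the records, field names and functions are hypothetical stand-ins, not the platform's real interface.

```python
from collections import Counter


def analysis(patients: list) -> Counter:
    """The single piece of analysis code that runs unchanged at every site.

    It returns aggregate counts only (vaccinated patients per region).
    """
    return Counter(p["region"] for p in patients if p["vaccinated"])


def run_federated(backends: dict) -> tuple:
    """Run the same analysis separately in each backend, then combine.

    Only the per-site aggregates are brought together; row-level records
    never leave the backend that holds them.
    """
    per_site = {name: analysis(rows) for name, rows in backends.items()}
    combined = sum(per_site.values(), Counter())
    return per_site, combined


# Invented rows standing in for the data held by each backend.
backends = {
    "tpp": [
        {"region": "London", "vaccinated": True},
        {"region": "Leeds", "vaccinated": False},
    ],
    "emis": [
        {"region": "London", "vaccinated": True},
        {"region": "Leeds", "vaccinated": True},
    ],
}

per_site, combined = run_federated(backends)
```

Combining counts by simple addition works here because each patient is assumed to appear in only one backend; an analysis with overlapping populations would need a different stitching step.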
URL http://opensafely.org/about
 
Title OpenSAFELY-TPP and OpenSAFELY-EMIS 
Description OpenSAFELY is a productive, highly secure, transparent, non-commercial, fully open-source software platform specifically developed for analysis of NHS electronic health records data. It can be deployed in any data centre to create a TRE with unprecedented security, transparency, and efficiency. With these privacy protections the OpenSAFELY team has earned public trust to implement OpenSAFELY across an unprecedented volume of NHS patient data.

OpenSAFELY-TPP and OpenSAFELY-EMIS: 58 million full GP records
OpenSAFELY has been fully implemented inside the data centres of TPP and EMIS, the two largest GP EHR providers in England, under COPI. This delivers analyses across the full raw primary care records of 58 million patients. Data includes all diagnoses, blood tests, prescriptions, investigations, and referrals. In total it is over 60 billion rows of detailed NHS data. This GP data has never been accessible before at national scale, and is not available from any other source (GPES in NHSD and ONS is a smaller derivative GP dataset). Other routes to access GP data have struggled with privacy and delivery problems. OpenSAFELY has active positive support from the professions and Citizens Juries. OpenSAFELY also has active positive support from privacy campaigners who typically block access to GP data at this scale. This is only possible because of the new approaches to privacy preservation and transparency in the OpenSAFELY tools.

Additional datasets are easily and regularly ingested and linked in OpenSAFELY
A range of crucial datasets for NHS service monitoring and academic research have already been linked, including:
- SUS (hospital admissions, outpatient visits)
- ECDS (coded A&E attendances)
- CPNS (death in hospital from Covid)
- SGSS (Covid test results)
- Covid vaccination data
- ONS (cause of death in and out of hospital)
- Household data (other pseudonymised occupants, whether it is a care home, approximate location)
- ICNARC (ICU data)
- ISARIC (detailed hospital records of hospitalised Covid patients)
- ONS Covid Infection Survey
- And more.
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact All OpenSAFELY outputs as per the rest of this RF submission. In addition there are currently 30 projects from 18 organisations being executed in OpenSAFELY. 
URL http://OpenSAFELY.org
 
Description OpenSAFELY in collaboration with LSHTM EHR team and EHR vendors EMIS and TPP 
Organisation EMIS Group
Country United Kingdom 
Sector Private 
PI Contribution OpenSAFELY
Collaborator Contribution We devised and coordinated the project and built the software and delivered substantial research within it. LSHTM collaborated closely on a wide range of research outputs. TPP collaborated closely on code, project design and vision, and delivered the database and backend.
Impact All of the OpenSAFELY outputs are by this collaboration.
Start Year 2020
 
Description OpenSAFELY in collaboration with LSHTM EHR team and EHR vendors EMIS and TPP 
Organisation London School of Hygiene and Tropical Medicine (LSHTM)
Department Faculty of Epidemiology and Population Health
Country United Kingdom 
Sector Academic/University 
PI Contribution OpenSAFELY
Collaborator Contribution We devised and coordinated the project and built the software and delivered substantial research within it. LSHTM collaborated closely on a wide range of research outputs. TPP collaborated closely on code, project design and vision, and delivered the database and backend.
Impact All of the OpenSAFELY outputs are by this collaboration.
Start Year 2020