Automated Clinical Epidemiology Studies (ACES) platform for complex epidemiology study designs and diverse databases

Lead Research Organisation: University of Birmingham
Department Name: Institute of Applied Health Research

Abstract

Routinely collected health care data are derived from electronic medical records, health insurance records and administrative records in healthcare organisations. These databases are increasingly being used for research. They have been used to generate ideas about causes of illness, evaluate health service policies, conduct clinical audits, carry out disease surveillance and look for adverse effects of medications. Beyond these benefits, routine databases are also useful for finding out whether the effects of drugs observed in Randomised Controlled Trials (RCTs) are also observed in real-world settings, especially in groups of people whose characteristics differ from those of RCT participants.

Despite the benefits of routinely collected healthcare databases, there are numerous challenges in utilising them for research. Some of the challenges are due to the difficulty of extracting data in a way that allows complex study designs. Data extraction is expensive and tedious, demanding considerable time, cost, effort and expertise. This is partly because the databases are huge, vary in structure and contain a wide range of data. Some of the difficulty in extraction is due to the complexity of the study designs needed to probe these databases: the data were not collected for research purposes and therefore have numerous inherent biases. Furthermore, any extraction needs clinical, epidemiological and technical expertise to interrogate these databases. These issues can lead to many human-induced errors and can result in data that are neither accurate nor reproducible.

Working with computer scientists, clinicians and methodologists, we have developed an Automated Clinical Epidemiology Studies (ACES) platform that extracts accurate and reproducible data for epidemiological studies from one database of medical records from general practices (The Health Improvement Network database). The platform enables data extraction to be completed within minutes to hours, where it previously took weeks to months when done manually. The platform has already enabled numerous studies in the last 12 months.

Now that we have developed such a platform, in this research programme we aim to extend it to: 1) complex epidemiological study designs and 2) databases that have different structures and coding systems.

For complex study designs we will develop and evaluate one platform for linked mothers and babies databases and another for studies of the effects of drugs (pharmaco-epidemiological studies). Pharmaco-epidemiological studies help with understanding the beneficial and harmful effects of medications. In the process of developing the automated platform for pharmaco-epidemiological studies we will also review and where necessary develop methodologies to estimate the effects of medications more accurately.

We have been in conversation with institutions in other countries about extending our ACES platform to their databases, which have different structures and coding systems, and evaluating whether this works. If we achieve this, we could research one question across multiple databases in different countries simultaneously.

Finally we will also assess the risks of having such an automated data extraction system. For example, it is possible to conduct numerous studies within a day and only report ones that are showing positive results. We will identify such issues by discussing with relevant stakeholders and produce a set of recommendations on how best to avoid such situations.

Technical Summary

Work Package (WP) 1: Development and validation of Automated Clinical Epidemiology Studies (ACES) software architecture for Automated Infant and Mothers Studies (AIMS) and Automated Pharmaco-Epidemiology Studies (APES)
We will develop algorithms to link mother and baby pairs utilising linked primary and secondary care data for AIMS, and implement evidence-based methodologies identified or developed in WP2 for APES. Functional and technical validation will be performed.
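
As an illustration only, one simple way to link mother and baby records is to match on a shared household identifier and require the baby's birth date to fall close to a delivery date recorded for the mother. The sketch below is not the AIMS algorithm, which this summary does not describe. The field names (household_id, delivery_date, birth_date) and the 30-day tolerance are hypothetical assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Mother:
    patient_id: str
    household_id: str      # hypothetical linkage key
    delivery_date: date    # hypothetical: date of a recorded delivery


@dataclass(frozen=True)
class Baby:
    patient_id: str
    household_id: str
    birth_date: date


def link_pairs(mothers, babies, tolerance_days=30):
    """Return (mother_id, baby_id) pairs that share a household and have
    delivery and birth dates within tolerance_days of each other."""
    by_household = {}
    for m in mothers:
        by_household.setdefault(m.household_id, []).append(m)

    pairs = []
    for b in babies:
        candidates = [
            m for m in by_household.get(b.household_id, [])
            if abs((b.birth_date - m.delivery_date).days) <= tolerance_days
        ]
        # Keep only unambiguous links: exactly one candidate mother.
        if len(candidates) == 1:
            pairs.append((candidates[0].patient_id, b.patient_id))
    return pairs
```

Dropping ambiguous matches trades completeness for accuracy; a real linkage algorithm would need validation against a reference standard, as the work package proposes.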

WP2: Precision methodologies for pharmaco-epidemiological studies
Three key biases in utilising routinely collected data (RCD) for pharmaco-epidemiological studies are: prescription by indication bias; immortal time bias; and not accounting for unobserved confounders. We will conduct systematic reviews to identify potential methodologies to reduce these biases and recommend where and in which circumstances these methodologies should be applied. Where there are gaps in evidence we will propose new methodologies to mitigate them.
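
To make immortal time bias concrete, the sketch below uses made-up follow-up data, not study results. A patient who starts a drug some time after cohort entry must have survived until that first prescription; counting that pre-treatment time as exposed dilutes the event rate in the exposed group. One widely used correction treats exposure as time-varying, so that time before the first prescription counts as unexposed. The function name, tuple layout and data are hypothetical, and the sketch assumes any event in a treated patient occurs after treatment starts.

```python
def rates(patients, time_varying):
    """Return (exposed_rate, unexposed_rate) in events per person-year.

    Each patient is (followup_years, years_to_first_rx or None, had_event).
    Assumes events in treated patients happen after treatment starts.
    """
    person_years = {"exposed": 0.0, "unexposed": 0.0}
    events = {"exposed": 0, "unexposed": 0}
    for followup, rx_start, had_event in patients:
        if rx_start is None:
            # Never treated: all follow-up is unexposed.
            person_years["unexposed"] += followup
            events["unexposed"] += int(had_event)
        elif time_varying:
            # Time before the first prescription counts as unexposed.
            person_years["unexposed"] += rx_start
            person_years["exposed"] += followup - rx_start
            events["exposed"] += int(had_event)
        else:
            # Naive analysis: whole follow-up misclassified as exposed.
            person_years["exposed"] += followup
            events["exposed"] += int(had_event)
    return (events["exposed"] / person_years["exposed"],
            events["unexposed"] / person_years["unexposed"])


# Illustrative data only: two treated and two never-treated patients.
example_patients = [
    (5.0, 2.0, True),
    (5.0, 3.0, False),
    (4.0, None, True),
    (6.0, None, False),
]
naive = rates(example_patients, time_varying=False)
corrected = rates(example_patients, time_varying=True)
```

With these invented numbers the naive analysis shows equal rates in both groups, while the time-varying analysis shows a higher rate in the exposed group, which is the direction of distortion immortal time bias produces.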

WP3: Extend and evaluate our current ACES architecture to databases with differing nomenclature and structure
We aim to extend our ACES architecture to enable studies to be conducted in diverse countries that have databases with differing nomenclature and structures. We will do this by developing algorithms to normalise structure and then by applying Extract, Transform and Load (ETL) architecture to conduct studies seamlessly across multiple databases.
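
A minimal sketch of the normalise-then-extract idea, assuming two hypothetical source databases whose field names and diagnosis codes differ. The database names, field names, codes and concept label are invented placeholders, not content from any real coding system and not the ACES implementation.

```python
# Per-database configuration: how source fields and local codes map onto
# a common structure. All names and codes here are hypothetical.
SOURCE_CONFIG = {
    "db_a": {
        "fields": {"patid": "patient_id", "medcode": "code", "eventdate": "event_date"},
        "codes": {"A_001": "diabetes_t2"},
    },
    "db_b": {
        "fields": {"person": "patient_id", "dx": "code", "date": "event_date"},
        "codes": {"B_XYZ": "diabetes_t2"},
    },
}


def transform(source, row):
    """Rename source fields to the common structure and attach a shared concept."""
    cfg = SOURCE_CONFIG[source]
    out = {common: row[src] for src, common in cfg["fields"].items()}
    out["concept"] = cfg["codes"].get(out["code"])  # None when the code is unmapped
    out["source"] = source
    return out


def extract_concept(rows_by_source, concept):
    """Run one study definition (a concept) across every configured database."""
    normalised = [
        transform(source, row)
        for source, rows in rows_by_source.items()
        for row in rows
    ]
    return [r for r in normalised if r["concept"] == concept]
```

Keeping the mappings in configuration rather than in code means a new database can, in principle, be added without changing the extraction logic, which is the property the work package aims to evaluate.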

WP4: Identify and manage ethical issues of ACES
There are ethical challenges with ACES tools that need to be identified and systems put in place to mitigate them; e.g. multiple associations can be rapidly tested and a researcher could pursue only those with positive associations. We will identify such risks and potential solutions for them by literature review, semi-structured interviewing with data providers and through focus groups with researchers. We will then develop solutions with key stakeholders.
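
The scale of this risk can be illustrated with standard arithmetic. If m independent associations with no true effect are each tested at significance level alpha, the probability that at least one appears positive by chance is 1 - (1 - alpha)^m. The sketch below assumes independent tests and alpha = 0.05; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def prob_any_false_positive(m, alpha=0.05):
    """Probability of at least one chance 'positive' among m independent
    tests of true null hypotheses, each at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    for m in (1, 10, 50):
        print(m, round(prob_any_false_positive(m), 3))
```

Under these assumptions the chance of at least one spurious finding is about 40% for 10 tests and over 90% for 50, which is why rapid automated testing combined with selective reporting is a concern.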

Publications

Chandan J (2021) Nonsteroidal Antiinflammatory Drugs and Susceptibility to COVID-19 in Arthritis & Rheumatology

Chandan JS (2019) Intimate partner violence and temporomandibular joint disorder in Journal of Dentistry

Lai A (2021) An informatics consult approach for generating clinical evidence for treatment decisions in BMC Medical Informatics and Decision Making

Lee SI (2022) Decreased renal function is associated with incident dementia: An IMRD-THIN retrospective cohort study in the UK in Alzheimer's & Dementia: the journal of the Alzheimer's Association

 
Description Public Health England utilises DExtER
Geographic Reach National 
Policy Influence Type Implementation circular/rapid advice/letter to e.g. Ministry of Health
 
Description INSIGHT Hub
Amount £3,400,000 (GBP)
Organisation Health Data Research UK 
Sector Private
Country United Kingdom
Start 11/2019 
End 08/2022
 
Description Improving testing for cardiometabolic diseases in women with previous gestational diabetes mellitus: an exemplar study on implementation and evaluation of a novel data driven randomised clinical trial platform in primary care
Amount £385,000 (GBP)
Organisation National Institute for Health Research 
Sector Public
Country United Kingdom
Start 06/2022 
End 01/2025
 
Description Marshalling health system experience of 'patients like me' to guide treatment decisions: a UK demonstrator of the informatics consult
Amount £199,000 (GBP)
Organisation Health Data Research UK 
Sector Private
Country United Kingdom
Start 03/2020 
End 03/2021
 
Description Multimorbidity and Pregnancy: Determinants, Clusters, Consequences and Trajectories (MuM-PreDiCT)
Amount £2,948,688 (GBP)
Funding ID MR/W014432/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 09/2021 
End 08/2024
 
Description OPTIMising therapies, discovering therapeutic targets and AI assisted clinical management for patients Living with complex multimorbidity (OPTIMAL study)
Amount £2,450,000 (GBP)
Funding ID NIHR202632_O 
Organisation National Institute for Health Research 
Sector Public
Country United Kingdom
Start 08/2021 
End 08/2024
 
Description Therapies for Long COVID in non-hospitalised individuals: From symptoms, patient-reported outcomes and immunology to targeted therapies (The TLC Study)
Amount £2,200,000 (GBP)
Funding ID MC_PC_20050 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 03/2021 
End 02/2023
 
Title DExtER updated version
Description DExtER is software that was created before the fellowship. It forms the basis on which the whole new Automated Clinical Epidemiology Studies (ACES) platform will work.
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact The tool enables data extraction according to study designs within a couple of hours, compared with weeks to months previously.
URL https://www.birmingham.ac.uk/research/activity/applied-health/research/health-informatics/Automated-...
 
Description ACES Global: Maastricht University 
Organisation Maastricht University (UM)
Country Netherlands 
Sector Academic/University 
PI Contribution We have established a collaboration with Maastricht University primary care to implement and evaluate the tool. We successfully checked the feasibility of implementing the tool during our visit to Maastricht. We are now in the process of finalising the contract. Once the contract is in place, we will be able to perform research jointly.
Collaborator Contribution They have made available their RNFM database for the evaluation.
Impact The key output has been the testing of the feasibility.
Start Year 2018
 
Description ACES for CALIBER 
Organisation University College London
Country United Kingdom 
Sector Academic/University 
PI Contribution We are implementing the DExtER software on the CALIBER platform. This will enable seamless data extraction and produce analysable datasets within hours.
Collaborator Contribution They have provided access to the CALIBER platform.
Impact We are currently working jointly towards a grant on Informatics Consult
Start Year 2019
 
Description Cegedim 
Organisation Cegedim
Country France 
Sector Private 
PI Contribution Cegedim is a company that provides THIN data and is also the provider of the VISION EHR. We have provided them with a licence to use DExtER, our innovative tool for automated clinical epidemiology studies. We have also provided expertise on processing the THIN database.
Collaborator Contribution Cegedim provided the THIN data for COVID-19-related research.
Impact 4 publications on COVID-19 research
Start Year 2020
 
Description Helen Dolk, Prof of Epidemiology and Health Services Research
Organisation Ulster University
Country United Kingdom 
Sector Academic/University 
PI Contribution Provide a wider team to assess and identify the gaps and provide clinical support
Collaborator Contribution Contributing to WP4, on polypharmacy, to identify suitable European data sources, help build research protocols for the future, and contribute to the design and conduct of an exemplar study using the EUROmediCAT database.
Impact NA
Start Year 2020
 
Description Ulster University 
Organisation Ulster University
Country United Kingdom 
Sector Academic/University 
PI Contribution We have analysed data to identify combinations of medications that need to be studied in multimorbid pregnancies.
Collaborator Contribution Key study designs for the EUROmediCAT database
Impact We have added Ulster University as a collaborator
Start Year 2020
 
Description University College London 
Organisation University College London
Country United Kingdom 
Sector Academic/University 
PI Contribution Project 1: LHS4NHS. We have developed a prototype of a data-driven learning health system for hospital care in the National Health Service (LHS4NHS). The project builds on the success of the data extraction for epidemiological research (DExtER) tool developed by our team at the University of Birmingham. By adding functionality such as automated study analytics and computable guidelines, we demonstrated the flow of data to knowledge and knowledge to practice in secondary care (Figure 1). Project 2: OPEHRRA. We are currently developing OPEHRRA (OPen Electronic Health Record Research and Analytics platform), a novel platform designed for researchers and clinicians, by researchers and clinicians, to support them in managing their local, regional and national patient populations. The platform will enable researchers and clinicians to engage in all aspects of the 'open science' cycle, including the open release of study design, data extraction and publication. They will have the opportunity to review the work of others in the network as well as request reviews of their own work. The ultimate goal is to make science transparent, reliable and accurate. OPEHRRA has several key elements: 1. Open access protocol submission; 2. Open protocol peer review comments and ethical approval; 3. Open clinical code list generation and storage; 4. Open data extraction; 5. Open analytics; 6. Open manuscript deposition and peer review.
Collaborator Contribution They developed a framework for informatics consult
Impact The main output of LHS4NHS is a demonstrator of a data-driven learning health system with an inbuilt computable clinical guideline and an informatics consult tool for secondary care data. The example we demonstrated in the better care event was based on a case study of managing diabetes in hospitalised patients with COVID-19. We are currently writing this up as a manuscript. The main output of OPEHRRA is an online demonstration of a portal for the research community to conduct 'open science' using electronic health record data.
Start Year 2020