Ensuring test evaluation research is applicable in practice: investigating the effects of routine data on the validity of test accuracy meta-analyses

Lead Research Organisation: University of Birmingham

Department Name: Health and Population Sciences

Abstract

Diagnosis is a difficult process. A patient who presents to their doctor ill will often undergo a process which involves being asked questions, observed, examined and perhaps even having blood or imaging 'tests'. Each question asked or observation made is either a diagnostic test in its own right or part of one and is a necessary part of arriving at a diagnosis.

However, some tests are better than others and, importantly, probably no test is 100% accurate. Sometimes a test result may suggest that a patient is healthy when they actually have disease, or that they have disease when they are healthy. This happens with all tests, and diagnostic test accuracy research aims to evaluate how often it happens, in other words, to determine how accurate tests are.

Essentially, when a clinician decides upon a diagnosis they are, consciously or otherwise, invoking a probabilistic process in which multiple tests are combined, and the patient's diagnosis should be the one that is most probable given the combination of all the test results. However, for this process to be truly beneficial to the patient, the clinician needs to know the accuracy of each of these tests and how likely the patient is to have the disease before the diagnostic process has even started.
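
The probabilistic process described above can be illustrated with Bayes' theorem. The following is a minimal sketch only: the function names and the numbers in the usage note are hypothetical, and chaining several tests assumes their results are independent given disease status, which may not hold in practice.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive=True):
    """Return P(disease | test result) using Bayes' theorem.

    All inputs are probabilities strictly between 0 and 1.
    """
    if positive:
        true_pos = pre_test * sensitivity
        false_pos = (1.0 - pre_test) * (1.0 - specificity)
        return true_pos / (true_pos + false_pos)
    false_neg = pre_test * (1.0 - sensitivity)
    true_neg = (1.0 - pre_test) * specificity
    return false_neg / (false_neg + true_neg)


def combine_tests(pre_test, results):
    """Update the probability of disease over a sequence of test results.

    `results` is a list of (sensitivity, specificity, positive) tuples.
    Assumption: the tests are conditionally independent given disease status.
    """
    prob = pre_test
    for sensitivity, specificity, positive in results:
        prob = post_test_probability(prob, sensitivity, specificity, positive)
    return prob
```

For example, with a hypothetical pre-test probability of 0.1 and a test with sensitivity 0.9 and specificity 0.8, a positive result raises the probability of disease to one in three. The same test applied where the pre-test probability differs gives a different answer, which is why both the accuracy and the starting probability matter.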

This is where the difficulty lies for those who practise evidence-based medicine. Although the accuracy of many tests has been estimated by research studies, for individual tests the accuracy may vary significantly between studies. This variation may depend on who is applying the test, how it is being applied, which patient it is being applied to and most significantly of all, how the accuracy was measured in the study. When there are several studies there are methods which allow us to combine their results. These methods may also help determine the real reasons why the test's accuracy varies. However, in general, the studies report insufficient data of sufficient quality to enable such analyses to be either possible or comprehensive.

Furthermore, from previous work, we have been able to demonstrate that in some cases the test accuracy reported by a study may be virtually impossible in some patient settings. This creates a problem for the doctor. How do they know which estimate of a test's accuracy to use if it varies greatly between studies and risks being nearly impossible for their own practice?

We have already begun to develop methods which make it possible to determine whether results from a test study are likely to accurately represent a doctor's practice in general. This would mean that a doctor could confidently apply the research to their own practice without reservation. However, sometimes the research is not reflective of the different clinical settings seen in practice and a more specific solution is required. This may be done by collecting routine data from the doctor's own setting and using it to determine a feasible range of values for the test's accuracy. This method, in its current form, is used to exclude the studies 'least likely' to derive a plausible estimate of a test's accuracy for the doctor in their own practice.

At the moment both methods are in development, but they could potentially be implemented in the real world and used to improve diagnosis. There are clear patient benefits to improving diagnostic performance, including reducing the number of patients treated unnecessarily and increasing the number treated appropriately. One of the aims of this research is to pilot integrating this method into General Practice to help diagnose infection. This could also help reduce the potential for antibiotic resistance by reducing the number of antibiotics prescribed inappropriately.

However, before this is done the methods need to be fully investigated to determine their utility and limitations. It may be that other approaches afford greater patient benefit, and an evaluation of these, together with the methods already described, will be the focus of the proposed research.

Technical Summary

Aims
Diagnostic test accuracy (DTA) research is not always informative in practice. Primary DTA studies and meta-analyses may produce results that do not transfer to practice. Recently we developed methods that evaluate the validity of DTA meta-analyses and incorporate routine data to generate a tailored meta-analysis (TMA) estimate that is specific to the practice. Both of these need further development and evaluation.

Objectives
1. To synthesise a database for use in other work streams and review constrained models
2. To develop models which incorporate constrained data
3. To investigate methods which determine the validity of a DTA review's results
4. To explore integrating the TMA model into UK General Practice

Methodology
There will be 4 overlapping work streams (WS)

WS 1
We will construct a database of different cases of DTA meta-analyses with associated routine practice data collected on the test to provide input data to WS 2-4. We will also review methods that explore modelling constrained data

WS 2
This will develop other models that include all studies but weight the less probable studies accordingly. We will investigate both a constrained maximum likelihood and a Bayesian approach. In addition, we will use covariate modelling in meta-regression analyses to explore causes of heterogeneity

WS 3
The validity of meta-analysis results may be determined by combining a cross validation procedure with an appropriate method for comparing primary and secondary study estimates. We have developed one method which predicts where a new study is likely to lie, but other approaches are possible and these will be investigated
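
As a rough illustration of the cross validation idea in WS 3, the sketch below leaves each study out in turn, pools the remaining studies, and compares the held-out study with the pooled estimate. This is not the validation statistic developed in the project: the fixed-effect inverse-variance pooling on the logit scale, the 0.5 continuity correction, and all function names are simplifying assumptions made for illustration.

```python
import math


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance (0.5 continuity correction)."""
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def pooled_logit(estimates):
    """Fixed-effect inverse-variance pooled logit and its variance."""
    weights = [1.0 / var for _, var in estimates]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / total_weight
    return pooled, 1.0 / total_weight


def leave_one_out_z(counts):
    """For each study, return the z-score of its logit against the pool of the others.

    `counts` is a list of (events, total) pairs, e.g. (true positives, diseased).
    """
    estimates = [logit_and_variance(e, n) for e, n in counts]
    z_scores = []
    for i, (est, var) in enumerate(estimates):
        rest = estimates[:i] + estimates[i + 1:]
        pooled, pooled_var = pooled_logit(rest)
        z_scores.append((est - pooled) / math.sqrt(var + pooled_var))
    return z_scores
```

A held-out study with a large absolute z-score is poorly predicted by the remaining studies, which is one simple signal that a summary estimate may not transfer to a new setting.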

WS 4
The TMA model for test evaluations will be integrated into general practice so GPs may apply it to their patients. Templates will be designed within the electronic records system (ERS) to collect routine data. The TMA model will be developed as a web-application to access from the ERS and will be tested in 6 practices

Planned Impact

This proposed research will produce a range of outputs that will be of interest to several different parties. In the early stages of the project, part of the communication plan will be to raise awareness of the difficulties of implementing diagnostic test accuracy (DTA) research. This will extend our previous research into methods aimed at determining the validity and applicability of DTA studies in practice. It will also involve challenging the current orthodoxy of evidence synthesis methods in diagnostic research. As a result the outputs will inform both methodologists and clinicians.

Towards the end of the project new methods will have been developed that enable reviewers involved in the process of constructing a DTA meta-analysis to evaluate the statistical validity of the estimates produced. Moreover, the research will produce models that may synthesise estimates when the accuracy is known to be constrained in the values it may take. In this instance, the combining of evidence from the research literature with data from the practice of interest has implications for the type of statistical modelling that may be used. Consequently, the models per se will be of interest to the statisticians and methodologists in the diagnostic research community. Furthermore, constrained models are not widely known to medical statisticians so it is anticipated that the dissemination of this work is likely to yield further applications outside of diagnostic research in the future.

Whilst a large element of the research is focussed on methodological and model development, the overarching theme is to enhance decision-making on the applicability of test accuracy research in practice. Accordingly, the methods will be applied to a number of diagnostic and screening tests used in clinical practice. In particular, tests used in the NHS national screening programmes, such as the nucleic acid amplification tests used to screen for Chlamydia, will be evaluated and the results fed back to the respective screening committees. The outputs will be pertinent to the NHS and could potentially play a role in the future organisation of screening services.

Furthermore, one of the work streams will integrate a working model into the electronic records systems of 6 general practice surgeries so it may be used as a diagnostic decision support tool that is tailored to the patients in each practice. Overall the research will be relevant to both clinicians working at the sharp end of health care delivery and policy decision-makers who plan and implement service provision.

Patients will be the ultimate beneficiary of this research. In achieving the aim of improving decisions on when to apply test evaluation research in practice there is the potential to improve decision making on the treatment and management of patients on both the small and large scale.

The early outputs are expected to emerge in the first two years of work. Many of the outputs are likely to have an impact within 5 years of the project commencing. Thus the main model and methodological development in test accuracy research is expected to be completed within the term of the project (4 years) and disseminated within five. However, it is expected that the models will find application in other fields and although this will widen the impact it is also likely to take much longer.

Funded Value:

£870,356

Funded Period:

Sep 16 - Aug 21

Funder:

MRC

Project Status:

Closed

Project Category:

Fellowship

Project Reference:

MR/N007999/1

Principal Investigator:

Brian Harvey Willis

Health Category:

Unclassified

Organisations

People	ORCID iD
Brian Harvey Willis (Principal Investigator / Fellow)

Publications

Baragilly M (2022) On estimating a constrained bivariate random effects model for meta-analysis of test accuracy studies in Statistical Methods in Medical Research

Baragilly M (2022) Optimising a coordinate ascent algorithm for the meta-analysis of test accuracy studies

Baragilly M (2023) Clustering Analysis of Multivariate Data: A Weighted Spatial Ranks-Based Approach in Journal of Probability and Statistics

Baragilly M (2022) Clustering functional data using forward search based on functional spatial ranks with medical applications. in Statistical methods in medical research

Chandan JS (2018) The association between idiopathic thrombocytopenic purpura and cardiovascular disease: a retrospective cohort study. in Journal of thrombosis and haemostasis : JTH

Cohen JF (2021) Preferred reporting items for journal and conference abstracts of systematic reviews and meta-analyses of diagnostic test accuracy studies (PRISMA-DTA for Abstracts): checklist, explanation, and elaboration. in BMJ (Clinical research ed.)

Dafoulas GE (2017) Type 1 diabetes mellitus and risk of incident epilepsy: a population-based, open-cohort study. in Diabetologia

Finnikin S (2021) Factors predicting statin prescribing for primary prevention: a historical cohort study. in The British journal of general practice : the journal of the Royal College of General Practitioners

Foley KG (2022) Risk of developing gallbladder cancer in patients with gallbladder polyps detected on transabdominal ultrasound: a systematic review and meta-analysis. in The British journal of radiology

Freeman K (2021) Test accuracy of faecal calprotectin for inflammatory bowel disease in UK primary care: a retrospective cohort study of the IMRD-UK data. in BMJ open

Freeman K (2019) Faecal calprotectin to detect inflammatory bowel disease: a systematic review and exploratory meta-analysis of test accuracy. in BMJ open

Freeman K (2022) Comparing outcomes from tailored meta-analysis with outcomes from a setting specific test accuracy study using routine data of faecal calprotectin testing for inflammatory bowel disease. in BMC medical research methodology

Freeman K (2021) The incidence and prevalence of inflammatory bowel disease in UK primary care: a retrospective cohort study of the IQVIA Medical Research Database. in BMC gastroenterology

Freeman K (2021) Faecal calprotectin testing in UK general practice: a retrospective cohort study using The Health Improvement Network database. in The British journal of general practice : the journal of the Royal College of General Practitioners

Gabr H (2022) Measuring and exploring mental health determinants: a closer look at co-residents' effect using a multilevel structural equations model. in BMC medical research methodology

McInnes MDF (2018) Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement. in JAMA

Mittal A (2019) Cancer as a risk factor for urinary tract calculi: a retrospective cohort study using 'The Health Improvement Network'. in Urolithiasis

Salameh JP (2020) Preferred reporting items for systematic review and meta-analysis of diagnostic test accuracy studies (PRISMA-DTA): explanation, elaboration, and checklist. in BMJ (Clinical research ed.)

Schwendicke F (2018) Visual and radiographic caries detection: a tailored meta-analysis for two different settings, Egypt and Germany. in BMC oral health

Toulis KA (2017) All-Cause Mortality in Patients With Diabetes Under Treatment With Dapagliflozin: A Population-Based, Open-Cohort Study in The Health Improvement Network Database. in The Journal of clinical endocrinology and metabolism

Toulis KA (2017) All-cause mortality in patients with diabetes under glucagon-like peptide-1 agonists: A population-based, open cohort study. in Diabetes & metabolism

Willis BH (2019) Tailored meta-analysis: an investigation of the correlation between the test positive rate and prevalence. in Journal of clinical epidemiology

Willis BH (2020) Clinical scores in primary care. in The British journal of general practice : the journal of the Royal College of General Practitioners

Willis BH (2020) Comparison of Centor and McIsaac scores in primary care: a meta-analysis over multiple thresholds. in The British journal of general practice : the journal of the Royal College of General Practitioners

Willis BH (2020) Maximum likelihood estimation based on Newton-Raphson iteration for the bivariate random effects model in test accuracy meta-analysis. in Statistical methods in medical research

Willis BH (2017) Measuring the statistical validity of summary meta-analysis and meta-regression results for use in clinical practice. in Statistics in medicine

Yip K (2017) 62: A retrospective multicentre audit of outcome among patients with anaplastic lymphoma kinase (ALK) gene rearrangement positive non-small cell lung cancer (NSCLC) who have been treated with crizotinib in England in Lung Cancer

Šumilo D (2019) Long-term impact of giving antibiotics before skin incision versus after cord clamping on children born by caesarean section: protocol for a longitudinal study based on UK electronic health records. in BMJ open

Šumilo D (2022) Long term impact of prophylactic antibiotic use before incision versus after cord clamping on children born by caesarean section: longitudinal study of UK electronic health records. in BMJ (Clinical research ed.)

Šumilo D (2022) Long-term impact of pre-incision antibiotics on children born by caesarean section: a longitudinal study based on UK electronic health records. in Health technology assessment (Winchester, England)

Policy Influence
Further Funding
Research Databases and Models
Research Tools and Methods
Collaboration
Software and Technical Products
Engagement Activities


Description	Involved in a guideline production for the meta-analyses of test accuracy studies
Geographic Reach	Multiple continents/international
Policy Influence Type	Membership of a guideline committee
URL	https://jamanetwork.com/journals/jama/fullarticle/2670259


Description	MRC Clinician Scientist fellowship
Amount	£864,337 (GBP)
Funding ID	MR/N007999/1
Organisation	Medical Research Council (MRC)
Sector	Public
Country	United Kingdom
Start	09/2016
End	08/2020


Description	NIHR Evaluation, Trials and Studies Coordinating Centre (NETSCC)
Amount	£405,646 (GBP)
Funding ID	16/150/01
Organisation	NIHR Evaluation, Trials and Studies Coordinating Centre (NETSCC)
Sector	Public
Country	United Kingdom
Start	02/2018
End	01/2020


Title	A constrained model for bivariate meta-analysis
Description	Tailored meta-analysis uses setting-specific knowledge of the test positive rate and disease prevalence to constrain the possible values for a test's sensitivity and specificity. The constrained region is used to select those studies relevant to the setting for meta-analysis using an unconstrained bivariate random effects model (BRM). However, sometimes there may be no studies to aggregate, or the summary estimate may lie outside the plausible or "applicable" region. Potentially these shortcomings may be overcome by incorporating the constraints in the BRM to produce a constrained model. Using a penalised likelihood approach we have developed an optimisation algorithm based on co-ordinate ascent and Newton-Raphson iteration to fit a constrained bivariate random effects model (CBRM) for meta-analysis. When combining setting-specific data with test accuracy meta-analysis, a constrained model is more likely to yield a plausible estimate for the sensitivity and specificity in the practice setting than an unconstrained model. Using numerical examples based on simulation studies and real datasets we compared its performance with the BRM in terms of bias, mean squared error and coverage probability. We also determined the 'closeness' of the estimates to their true values using the Euclidean and Mahalanobis distances. The CBRM produced estimates which in the majority of cases had lower absolute mean bias and greater coverage probability than the BRM. The estimated sensitivities and specificities for the CBRM were, in general, closer to the true values than those for the BRM. For the two real datasets, the CBRM produced estimates which were in the applicable region, in contrast to the BRM.
Type Of Material	Improvements to research infrastructure
Year Produced	2022
Provided To Others?	Yes
Impact	A publication has already resulted. The work has also highlighted the deficiencies of meta-analyses of test accuracy studies that lack setting-specific data. The constrained model will overcome this and should be used by future meta-analysts
URL	https://journals.sagepub.com/doi/full/10.1177/09622802211065157
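
The study-selection step described in the entry above can be sketched as follows. The sketch relies on the standard identity that the test positive rate r equals p*Se + (1 - p)*(1 - Sp), where p is the prevalence, Se the sensitivity and Sp the specificity. The interval inputs, the function names and the numbers in the usage note are hypothetical, and the published method may define the applicable region differently (for example, using confidence regions for each study), so this is an illustration under stated assumptions rather than the project's implementation.

```python
def implied_positive_rate(prevalence, sensitivity, specificity):
    """Test positive rate implied by a prevalence and a (sensitivity, specificity) pair."""
    return prevalence * sensitivity + (1.0 - prevalence) * (1.0 - specificity)


def in_applicable_region(sensitivity, specificity, prev_interval, tpr_interval):
    """Check whether a study's point estimate is compatible with the setting's data.

    prev_interval and tpr_interval are (low, high) ranges for the prevalence and
    test positive rate in the practice (assumed to come from routine data).
    The implied rate is linear in prevalence, so its range over prev_interval
    is spanned by the values at the two end points.
    """
    p_lo, p_hi = prev_interval
    r_lo, r_hi = tpr_interval
    r_a = implied_positive_rate(p_lo, sensitivity, specificity)
    r_b = implied_positive_rate(p_hi, sensitivity, specificity)
    implied_lo, implied_hi = min(r_a, r_b), max(r_a, r_b)
    # Compatible if the implied range overlaps the observed range.
    return implied_lo <= r_hi and implied_hi >= r_lo


def select_studies(studies, prev_interval, tpr_interval):
    """Keep the (label, sensitivity, specificity) studies lying in the applicable region."""
    return [s for s in studies
            if in_applicable_region(s[1], s[2], prev_interval, tpr_interval)]
```

With a hypothetical prevalence range of 0.05 to 0.10 and test positive rate range of 0.10 to 0.20, a study reporting sensitivity 0.9 and specificity 0.9 is retained, whereas one reporting 0.5 and 0.6 implies a positive rate of about 0.41 and is excluded. If every study is excluded there is nothing left to pool, which is the shortcoming the constrained model is intended to address.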


Title	NAAT 2012 reported_Data analysis technique
Description	The research material consists of data from primary studies used in meta-analysis. Occasionally patient audit data from my own practice is used
Type Of Material	Model of mechanisms or symptoms - human
Provided To Others?	No
Impact	A paper has been accepted for publication and others should follow. The work is on-going


Title	Optimisation algorithm for meta-analysis model
Description	Development of a new optimisation algorithm for the conduct of meta-analyses of test accuracy studies
Type Of Material	Improvements to research infrastructure
Year Produced	2019
Provided To Others?	Yes
Impact	Too early to say
URL	https://journals.sagepub.com/doi/10.1177/0962280219853602


Title	The effects of pre-test probability on the performance of clinical tests
Description	From statistical modelling and using data collected from practice, I have been able to demonstrate the effects of knowing when patients have a high probability of disease on the performance of clinical tests applied by doctors. The example used was for x-rays, but it is likely that this extends to other clinical tests and has implications for the transferability of study results into practice. The research has been submitted for publication
Type Of Material	Model of mechanisms or symptoms - human
Provided To Others?	No
Impact	The research is yet to be published, but it is likely that future evaluations of clinical tests and their implementation in practice will need to take into account the results of this research. The work is on-going


Title	Validation statistic
Description	This is a statistic which ascertains whether the results of meta-analyses are likely to be valid in a new setting
Type Of Material	Improvements to research infrastructure
Provided To Others?	No
Impact	This has just been presented at an international conference and the work is currently under peer review. The work is on-going
URL	http://2015.colloquium.cochrane.org/abstracts/are-predictions-test-accuracy-meta-analyses-valid-prac...


Title	Optimisation algorithm for bivariate random effects model
Description	This is a novel algorithm developed for handling a model commonly used in meta-analysis
Type Of Material	Computer model/algorithm
Year Produced	2019
Provided To Others?	Yes
Impact	None yet
URL	https://journals.sagepub.com/doi/10.1177/0962280219853602


Title	Tailored meta-analysis model
Description	The results of diagnostic studies may be wholly unrepresentative of particular practice settings. This is a method which allows us to decide which studies are representative
Type Of Material	Data analysis technique
Provided To Others?	No
Impact	None yet
URL	http://www.ncbi.nlm.nih.gov/pubmed/24447592


Title	Validation statistic for meta-analyses
Description	This is a novel statistic designed to test whether meta-analysis estimates are likely to be valid in a new setting
Type Of Material	Data analysis technique
Year Produced	2017
Provided To Others?	Yes
Impact	Appeared on Wikipedia, cited 19 times
URL	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5575530/pdf/SIM-36-3283.pdf


Description	Collaboration on use of the THIN database to produce pharmaco-epidemiology papers
Organisation	University of Birmingham
Country	United Kingdom
Sector	Academic/University
PI Contribution	Provided statistical support for the research
Collaborator Contribution	Have extracted the data for analysis
Impact	3 papers
Start Year	2016


Description	Collaboration with associate professor in dentistry
Organisation	Charité - University of Medicine Berlin
Country	Germany
Sector	Academic/University
PI Contribution	Based on previous methodology, I was approached by a colleague in Germany to apply the methods to diagnosis in dentistry. I provided the methods and data analysis
Collaborator Contribution	The partner provided the data
Impact	There is a potential paper but it is still under review.
Start Year	2016


Description	Long term impact of pre-incision antibiotics on babies born by caesarean section
Organisation	University of Birmingham
Department	Institute of Applied Health Research
Country	United Kingdom
Sector	Academic/University
PI Contribution	Offered statistical and GP expertise to the application for a grant
Collaborator Contribution	The research was led by a colleague
Impact	Funding has been achieved with the NIHR
Start Year	2017


Description	Modelling survival data in large primary care databases
Organisation	Brown University
Country	United States
Sector	Academic/University
PI Contribution	This was work on the analysis of routine data
Collaborator Contribution	They provided the computer clusters and expertise regarding analysis
Impact	A paper is expected
Start Year	2017


Description	Qrisk2 scores and prescription of statins
Organisation	University of Birmingham
Country	United Kingdom
Sector	Academic/University
PI Contribution	An investigation into the prescribing behaviour of GPs when prescribing statins. SF has gathered data, TM has supervised and I have analysed the data.
Collaborator Contribution	SF has gathered data, TM has supervised and I have analysed the data.
Impact	None so far
Start Year	2019


Description	Risk of Developing Gallbladder Cancer in Patients with Gallbladder Polyps Detected on Trans-Abdominal Ultrasound Examination
Organisation	Velindre Cancer Centre
Country	United Kingdom
Sector	Hospitals
PI Contribution	This is a systematic review and meta-analysis of transabdominal ultrasound (TA US) of gallbladder polyps to identify cancers. We have done the Bayesian modelling on the data
Collaborator Contribution	The other members of the collaboration provided the data by systematically reviewing the literature
Impact	1 abstract under submission
Start Year	2021


Title	Optimisation algorithm for the bivariate random effects model used in meta-analysis of test accuracy studies
Description	The optimisation algorithm combines the Newton-Raphson method with the profile likelihood and observed Fisher information to fit the bivariate random effects model used in the meta-analysis of test accuracy studies
Type Of Technology	Software
Year Produced	2019
Open Source License?	Yes
Impact	None yet
URL	https://journals.sagepub.com/doi/10.1177/0962280219853602
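
To give a flavour of Newton-Raphson iteration with the observed information, the toy sketch below fits a much simpler model than the bivariate random effects model: a single pooled sensitivity on the logit scale, with no between-study variation. It is not the published algorithm, and the function name and data in the usage note are hypothetical.

```python
import math


def newton_raphson_logit(true_positives, diseased, tol=1e-10, max_iter=50):
    """Maximise a pooled binomial log-likelihood over theta = logit(sensitivity).

    Returns the estimate of theta and its standard error, taken from the
    observed information at the solution. Toy fixed-effect model only.
    """
    total_tp = sum(true_positives)
    total_n = sum(diseased)
    theta = 0.0  # start at sensitivity = 0.5
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-theta))
        score = total_tp - total_n * p          # first derivative of the log-likelihood
        information = total_n * p * (1.0 - p)   # minus the second derivative
        step = score / information              # Newton-Raphson update
        theta += step
        if abs(step) < tol:
            break
    p = 1.0 / (1.0 + math.exp(-theta))
    std_error = 1.0 / math.sqrt(total_n * p * (1.0 - p))
    return theta, std_error
```

Calling `newton_raphson_logit([45, 40], [50, 50])` converges to the logit of the pooled proportion 85/100. In the full bivariate model the score and information become a vector and a matrix, and the variance parameters also have to be estimated, which is where the profile likelihood comes in.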


Description	A lecture series of 4 lectures given at Brown University
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Study participants or study members
Results and Impact	These were four lectures on the translation of test research evidence into practice, based on the work that I have carried out over the last 8 years
Year(s) Of Engagement Activity	2017
URL	https://www.brown.edu/academics/public-health/research/evidence-synthesis-in-health/news/2017-03/vis...


Description	A talk on a paper I wrote on 'philosophy of science and the diagnostic process' given to the Test Research Group at Exeter University in June 2018
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Postgraduate students
Results and Impact	A talk on a previously published paper on the 'philosophy of science and the diagnostic process' given to the Test Research Group at Exeter University in June 2018
Year(s) Of Engagement Activity	2018


Description	Diagnostic decision making
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Patients, carers and/or patient groups
Results and Impact	This was a talk to the patient participation group of a general practice surgery to give an indication of my research and how doctors make diagnostic decisions in general
Year(s) Of Engagement Activity	2017