Evidence synthesis of diagnostic test performance from a decision-making perspective

Lead Research Organisation: University of Bristol
Department Name: Social Medicine

Abstract

A diagnostic test is any kind of medical test or assessment used to determine whether a patient does or does not have a disease. A highly accurate, but also highly invasive and/or expensive, test for a disease might exist (the 'gold standard' or reference test). The accuracy of less invasive and/or expensive tests ('index tests') is obviously of great interest. Assuming that the reference test has correctly classified all patients, the accuracy of the index test can be quantified using two measures: (1) the sensitivity, defined as the proportion of diseased individuals who correctly test positive on the index test, and (2) the specificity, the proportion of non-diseased individuals who correctly test negative.
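As a minimal sketch of the two accuracy measures just defined, the following computes sensitivity and specificity from a 2x2 cross-classification of index test results against the reference test. The function name and the counts are hypothetical, chosen only for illustration, and the reference test is assumed to classify every patient correctly.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 counts.

    tp: diseased patients who test positive on the index test
    fn: diseased patients who test negative
    tn: non-diseased patients who test negative
    fp: non-diseased patients who test positive
    Disease status is taken from the reference test, assumed error-free.
    """
    diseased = tp + fn
    non_diseased = tn + fp
    if diseased == 0 or non_diseased == 0:
        raise ValueError("need at least one diseased and one non-diseased patient")
    return tp / diseased, tn / non_diseased


# Hypothetical study: 100 diseased and 200 non-diseased patients.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=160, fp=40)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```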

In practice, most index tests do not directly deliver a 'disease' versus 'no disease' outcome. Often the test delivers a number on a continuous scale, for example the concentration of some substance in the blood. A patient is classified as diseased if his/her test result is greater than some cut-off value. As this value is reduced over its possible range, sensitivity is increased, but at the cost of reduced specificity. A graph of this relationship is called the Receiver Operating Characteristic (ROC) curve. This can often be drawn even for tests without an explicit continuous measure since, for example, some clinicians interpreting images will tend to have a stricter definition of 'disease' than others.
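The trade-off described above can be illustrated with a short sketch that sweeps a cut-off over a continuous test result and records the sensitivity and specificity at each value. The measurements and cut-offs are invented for illustration, and a result strictly greater than the cut-off is taken to mean 'test positive'.

```python
def roc_points(diseased_values, non_diseased_values, cutoffs):
    """Return (cutoff, sensitivity, specificity) for each cut-off, highest first.

    A patient tests positive when their value is strictly greater than the cut-off.
    """
    points = []
    for c in sorted(cutoffs, reverse=True):
        sens = sum(v > c for v in diseased_values) / len(diseased_values)
        spec = sum(v <= c for v in non_diseased_values) / len(non_diseased_values)
        points.append((c, sens, spec))
    return points


# Hypothetical blood concentrations for patients classified by the reference test.
diseased = [4.1, 5.3, 6.0, 6.8, 7.5]
non_diseased = [2.0, 3.1, 3.9, 4.5, 5.6]

roc = roc_points(diseased, non_diseased, cutoffs=[3.0, 4.0, 5.0, 6.0])
for cutoff, sens, spec in roc:
    print(f"cut-off {cutoff}: sensitivity {sens:.1f}, specificity {spec:.1f}")
```

Plotting sensitivity against 1 - specificity for these points would trace out an empirical ROC curve.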

Often estimates of the sensitivity and specificity of a test are available from multiple studies. As in other areas of medicine (e.g. treatment effectiveness), it is desirable to pool these estimates, to summarise all available data. This is called 'meta-analysis'. Meta-analysis of diagnostic test accuracy is often considered more complex than that in other areas, due to there being two dimensions of test accuracy. Further, although variability in e.g. patient populations and study designs is a concern throughout all areas of meta-analysis, for diagnostic test accuracy there is also the specific concern that studies are likely to have used different cut-off values. These meta-analyses therefore tend to exhibit a high degree of between-study variability. This leads to inconclusive summary evidence on test accuracy, and difficulty in making decisions or recommendations about the best testing strategies.
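To make the idea of pooling concrete, here is a deliberately simplified sketch: an inverse-variance weighted average of study sensitivities on the logit scale. It is a univariate, fixed-effect calculation that ignores specificity, the correlation between the two measures, and between-study variability, so it is not one of the bivariate models used in practice. The study counts and the 0.5 continuity correction are assumptions made for the example.

```python
import math


def pool_logit_proportions(events, totals):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, n in zip(events, totals):
        # Continuity correction (an assumption here) avoids infinite logits.
        pos = x + 0.5
        neg = (n - x) + 0.5
        logit = math.log(pos / neg)
        variance = 1.0 / pos + 1.0 / neg  # approximate variance of the logit
        weight = 1.0 / variance
        weighted_sum += weight * logit
        weight_total += weight
    pooled_logit = weighted_sum / weight_total
    return 1.0 / (1.0 + math.exp(-pooled_logit))  # back-transform to a proportion


# Hypothetical studies: true positives and numbers of diseased patients.
true_positives = [45, 30, 80]
diseased_totals = [50, 40, 100]

pooled_sens = pool_logit_proportions(true_positives, diseased_totals)
print(f"pooled sensitivity = {pooled_sens:.3f}")
```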

Standard methods for meta-analysis of diagnostic test accuracy pool only a single pair of sensitivity and specificity from each study. However, often studies report multiple points on the ROC curve. In addition, studies often report the cut-off value(s) used, but these are not usually incorporated into the model. Intuitively, these two types of information, both usually discarded, offer great potential to explain some of the between-study variability, helping us to make better sense of the evidence and, in particular, to choose the best cut-off.

Meta-analysis methods focus on producing pooled estimates of sensitivity and specificity, or a 'summary' ROC curve. But, in practice, important decisions such as whether to test, which test to use and at which cut-off should also be based on other information: e.g. the effectiveness of treatments to be provided in light of test results, the amount of disease in the population, and relevant costs. The cost-effectiveness of testing strategies can be quantified by a decision model, but there are questions regarding how best to inform this model from the meta-analysis results. In addition, there has been relatively little work to date on methods for choosing between alternative tests.
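A toy decision model shows how accuracy estimates might be combined with prevalence and costs to choose a cut-off. Everything here is a hypothetical assumption for illustration: the function name, the payoff values (in arbitrary utility units), the prevalence, and the accuracy pairs at each cut-off. A real economic model would be far richer.

```python
def expected_net_benefit(sens, spec, prevalence, benefit_tp, harm_fp, test_cost):
    """Expected net benefit per person tested, in arbitrary utility units.

    Only true positives (who receive effective treatment) and false positives
    (who receive unnecessary treatment) are given payoffs in this sketch.
    """
    true_positive_rate = prevalence * sens
    false_positive_rate = (1.0 - prevalence) * (1.0 - spec)
    return true_positive_rate * benefit_tp - false_positive_rate * harm_fp - test_cost


# Hypothetical (sensitivity, specificity) pairs at four candidate cut-offs.
candidates = {3.0: (1.0, 0.2), 4.0: (1.0, 0.6), 5.0: (0.8, 0.8), 6.0: (0.4, 1.0)}

net_benefit = {
    cutoff: expected_net_benefit(
        sens, spec, prevalence=0.1, benefit_tp=10.0, harm_fp=0.5, test_cost=0.05
    )
    for cutoff, (sens, spec) in candidates.items()
}
best_cutoff = max(net_benefit, key=net_benefit.get)
print(f"best cut-off under these assumptions: {best_cutoff}")
```

Changing the prevalence or the relative size of the payoffs can shift which cut-off is preferred, which is why accuracy estimates alone are not enough to make the decision.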

In this project, I will evaluate some sophisticated methods that have been suggested in this research area. In addition, I will work on further methods development for: (i) making full use of 'all available evidence' on testing, (ii) informing a decision model from a meta-analysis of test accuracy, and (iii) modelling test comparisons and choosing between multiple diagnostic tests.

Technical Summary

Aims: To evaluate, improve and develop methods for (i) making better use of 'all available data' in meta-analyses of diagnostic test accuracy, (ii) informing an economic decision model based on the synthesised evidence and (iii) comparing and choosing between competing diagnostic tests.

Objectives:
1) To investigate the circumstances in which inferences are improved by accounting for between-study correlation in sensitivity and specificity.
2) To evaluate, and potentially to extend, methods for incorporating multiple points on the ROC curve from some studies into a meta-analysis.
3) To develop methods for incorporating explicit threshold values into the meta-analysis model, when available.
4) To investigate a range of methods for informing the decision model, and produce guidelines on the appropriate approach in different scenarios.
5) To develop methods for network meta-analysis of diagnostic test accuracy.
6) To investigate the validity of test comparisons based on less-than-ideal data, for example when tests A and B have only been evaluated in separate sets of studies.
7) To perform state-of-the-art analyses of multiple real examples.
8) To develop a 2-3 day short course, to disseminate high quality methodology.

Methodology: I envision completing several of these objectives via extensions of the Rutter and Gatsonis (Statistics in Medicine, 2001) model for meta-analysis of diagnostic test accuracy, which is one of several suggested parameterisations for bivariate meta-analysis in this area. I will also draw on the literature on signal detection theory, economic decision analyses, and network meta-analysis.

Scientific Opportunities:
1) Improved methodology in this area, made available to statistical analysts.
2) Better summary evidence on the accuracy of diagnostic tests, and the relative accuracy of competing tests.
3) Improved, more cost-effective decision-making regarding testing strategies.
4) My own professional development.

Planned Impact

As described in the Case for Support, the current standard methods for evaluation of diagnostic tests are sub-optimal. In particular, many evidence syntheses in this area 'throw away' a lot of potentially informative data. The very high levels of unexplained heterogeneity that are often observed in meta-analyses of diagnostic test accuracy are likely to be due in part to this. Strengthened methods for analysing test performance, comparing tests and deciding how to use tests will lead to better decision making. Ultimately, patients and clinicians will benefit from improved diagnosis, and the NHS from more cost-effective use of health resources.

More directly, methods development in this area will benefit all those who have a role in making recommendations about the use of diagnostic tests. In the UK these are the National Screening Committee, the NICE Diagnostic Advisory Committee, and the NICE Centre for Clinical Practice, which is responsible for NICE Clinical Guidelines (CGs), along with the National Collaborating Centres, which carry out most of the work behind CGs. In addition, manufacturers of diagnostic devices would be able to use improved methodologies in their submissions to bodies such as NICE. NICE is considered to be one of the leading organisations of its type world-wide, both in terms of its processes and the excellence of its methodology. Statistical methods used at NICE therefore have the potential to have a far-ranging impact.

In the USA, recommendations are made largely through the Agency for Healthcare Research and Quality (AHRQ). I have planned two research and training visits to the Centre for Evidence Based Medicine at Brown University, which hosts a centre designated by the AHRQ to perform technology appraisals, with a particular focus on diagnostic tests. These visits, and subsequent collaborations, will facilitate my research having an impact on AHRQ guidance for analysis of tests.

My research will also benefit the Cochrane Collaboration, and other bodies performing systematic reviews and meta-analyses of diagnostic test performance data.

I will benefit from improved research skills, a greatly increased knowledge of health economics and the processes involved in making healthcare decisions and recommendations, and the development of multiple collaborations. Each of these factors will help me to establish myself as an independent researcher, with the potential for leading subsequent programs of research and managing teams. For example, during my fellowship I aim to learn about the complex issues regarding evaluating tests and screening in the absence of a 'gold standard' test. There is a clear need for improved methodology in this area, and I envision this initial research leading to a grant or further fellowship application in the future.

Publications

Clive AO (2016) Interventions for the management of malignant pleural effusions: a network meta-analysis. in The Cochrane database of systematic reviews

 
Description Coordinated and led a 2 day workshop at NICE Centre for Guidelines on meta-analysis of diagnostic test accuracy
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact Provided training to NICE analysts
 
Description Establishment (with colleagues) of new annual 2 day short course at the University of Bristol: "Introduction to Diagnostic Research"
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
URL https://www.bristol.ac.uk/medical-school/study/short-courses/introduction-to-diagnostic-research/
 
Description Invited to join panel - NICE Medtech Innovation Briefings (MIBs) panel on diagnostic tests for COVID-19
Geographic Reach National 
Policy Influence Type Membership of a guideline committee
 
Description Ran 1 day training course for NICE Guidelines
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact Contributed to knowledge/training of professional at NICE
 
Description Ran 1 day workshop 'Systematic Reviews and Meta-Analysis of Diagnostic Test Accuracy' at NICE Guidelines, London
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Systematic Reviews and Meta-Analysis of Diagnostic Test Accuracy: A one day workshop (at NICE)
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Developing evidence based optimal testing strategies to monitor long term conditions in primary care
Amount £1,194,693 (GBP)
Organisation National Institute for Health Research 
Sector Public
Country United Kingdom
Start  
 
Description HCD: Synthesis of networks of evidence on test accuracy, with and without a 'gold standard'
Amount £454,674 (GBP)
Funding ID MR/T044594/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 06/2021 
End 05/2024
 
Description Health Foundation (PI Laura Howe)
Amount £449,973 (GBP)
Organisation The Health Foundation 
Sector Charity/Non Profit
Country United Kingdom
Start 03/2018 
End 03/2021
 
Description Production of Technology Assessment Reviews (TARs) for the National Institute for Health Research (NIHR)
Amount £4,325,000 (GBP)
Funding ID NIHR131974 
Organisation National Institute for Health Research 
Sector Public
Country United Kingdom
Start 04/2022 
End 03/2027
 
Description The benefits, harms and costs of surveillance for hepatocellular carcinoma in people with cirrhosis: synthesis of observational and diagnostic test accuracy data and cost-utility analysis
Amount £337,568 (GBP)
Funding ID NIHR134670 
Organisation National Institute for Health Research 
Sector Public
Country United Kingdom
Start 07/2022 
End 03/2024
 
Description Wastewater analysis of traces of illicit drug-related chemicals for law enforcement and public health (WATCH)
Amount € 500,000 (EUR)
Funding ID Contract HOME/2015/ISFP/PR/DRUG/0062 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 12/2016 
End 06/2018
 
Description What is the optimum strategy for identifying adults and children with coeliac disease? A systematic review and economic models
Amount £348,750 (GBP)
Organisation National Institute for Health Research 
Sector Public
Country United Kingdom
Start 01/2020 
End 06/2021
 
Title Meta-analysis of test accuracy across all thresholds 
Description I developed a new statistical method for meta-analysis of diagnostic test accuracy across the full range of possible cutoffs for calling a test result 'positive'. 
Type Of Material Data analysis technique 
Year Produced 2016 
Provided To Others? No  
Impact Statistical methodology paper currently under revision for resubmission to the journal 'Statistics in Medicine'. This work has also resulted in several new collaborations and papers in progress (with researchers in Canada and Germany). 
 
Description COST ACTION CA18208: Novel tools for test evaluation and disease prevalence estimation 
Organisation University of Thessaly
Country Greece 
Sector Academic/University 
PI Contribution I was one of the (many) proposers of this new (2019 onwards) COST Action and am now one of 2 Management Committee members for the UK
Collaborator Contribution Funding proposal and management of this award led by Polychronis Kostoulas, University of Thessaly, Greece. He initiated this new pan-European collaboration, aiming to promote use of and provide training in use of latent class models for diagnostic test evaluation and prevalence estimation.
Impact 2 day training school on Latent Class Models in Athens, February 2020. Multi-disciplinary event for epidemiologists, statisticians, clinicians and veterinarians
Start Year 2019
 
Description Collaboration with University Medical Center Hamburg-Eppendorf and University of Freiburg 
Organisation University Medical Center Freiburg
Country Germany 
Sector Hospitals 
PI Contribution Statistical analysis, wrote section of publication and contributed to revisions of other sections
Collaborator Contribution Collaborator Zapf at University Medical Center Hamburg-Eppendorf led on a comparison of statistical methods for meta-analysis of diagnostic test accuracy (including a method I developed as part of my MRC fellowship). Other collaborators on this project included Ruecker at University of Freiburg.
Impact Zapf A, Albert C, Frömke C, Haase M, Hoyer A, Jones HE, Rücker G. 'Meta-analysis of diagnostic accuracy studies with multiple thresholds - comparison of different approaches'. Biometrical Journal, 2021
Start Year 2017
 
Description Collaboration with University Medical Center Hamburg-Eppendorf and University of Freiburg 
Organisation University Medical Center Hamburg-Eppendorf
Country Germany 
Sector Hospitals 
PI Contribution Statistical analysis, wrote section of publication and contributed to revisions of other sections
Collaborator Contribution Collaborator Zapf at University Medical Center Hamburg-Eppendorf led on a comparison of statistical methods for meta-analysis of diagnostic test accuracy (including a method I developed as part of my MRC fellowship). Other collaborators on this project included Ruecker at University of Freiburg.
Impact Zapf A, Albert C, Frömke C, Haase M, Hoyer A, Jones HE, Rücker G. 'Meta-analysis of diagnostic accuracy studies with multiple thresholds - comparison of different approaches'. Biometrical Journal, 2021
Start Year 2017
 
Description Collaboration with academics at Brown University, USA 
Organisation Brown University
Country United States 
Sector Academic/University 
PI Contribution I visited Brown University, USA, in 2016 and again in 2019, to establish and work on this new collaboration. I contributed ideas for new methodology and led writing of publications.
Collaborator Contribution My collaborators discussed my ideas with me and helped me to make significant improvements to the methods I proposed during my research visits, drawing on their own experiences and expertise. They supported me in obtaining a formal position as Visiting Scholar throughout my fellowship, which allowed me to use Brown School of Public Health facilities and offices during my research visits.
Impact One peer reviewed publication to date (at least one more to follow): Jones HE, Gatsonis CA, Trikalinos TA, Welton NJ, Ades AE. 'Quantifying how diagnostic test accuracy depends on threshold in a meta-analysis'. Statistics in Medicine, 38(24):4789-4803, 2019. Gatsonis and Trikalinos are both Professors at the School of Public Health, Brown University, USA.
Start Year 2016
 
Description International collaboration, led by McGill University, Canada 
Organisation McGill University
Country Canada 
Sector Academic/University 
PI Contribution Statistical analysis, drafted section of manuscript and contributed to revision of other sections
Collaborator Contribution Collaborators at McGill led on a comparison of different statistical methods for meta-analysis of test accuracy (including one that I developed as part of my MRC fellowship), as applied to a large data set, and co-ordinated writing of the paper.
Impact Publication: Benedetti A, Levis B, Rücker G, Jones HE, Schumacher M, Ioannidis JPA, Thombs B, and the DEPRESsion Screening Data (DEPRESSD) Collaboration. 'An empirical comparison of three methods for multiple cut-off diagnostic test meta-analysis of the Patient Health Questionnaire-9 (PHQ-9) depression screening tool using published data versus individual level data'. Research Synthesis Methods, 11(6):833-848, 2020
Start Year 2017
 
Description 'Use of a random effects meta-analysis in the design and analysis of a new clinical trial' Oral presentation at the Society for Research Synthesis Methodology conference. Florence, Italy 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Oral presentation to other experts in evidence synthesis methodology - at annual conference of the Society for Research Synthesis Methods
Year(s) Of Engagement Activity 2016
 
Description Gave a 90 minute workshop at the Global Evidence Summit (Meta-analysis of diagnostic test accuracy studies for healthcare policy and decision making) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Ran (together with Dr Rhiannon Owen, University of Leicester, UK) a 90 minute workshop titled "Meta-analysis of diagnostic test accuracy studies for healthcare policy and decision making" at the Global Evidence Summit, Cape Town, South Africa, September 2017
Year(s) Of Engagement Activity 2017
 
Description Invited seminar at University of Leicester, December 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Invited seminar: 'Meta-analysis of diagnostic test accuracy across all possible cut-offs and selection of the optimal cut-off'. Biostatistics seminar at the University of Leicester. December 2019
Year(s) Of Engagement Activity 2019
 
Description Invited seminar at University of York, June 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Invited seminar: 'Meta-analysis of diagnostic test accuracy across all possible cut-offs and selection of the optimal cut-off'. Seminar at the Centre for Health Economics, University of York. June 2019.
Year(s) Of Engagement Activity 2019
 
Description Talk at "Methods for Evaluation of medical prediction Models, Tests And Biomarkers" (MEMTAB) conference, 2018 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Dr H Jones gave oral presentation at "Methods for Evaluation of medical prediction Models, Tests And Biomarkers" (MEMTAB) conference.
Title: "Quantifying how diagnostic test accuracy depends on threshold in a meta-analysis"
Utrecht, Netherlands. 2018
Year(s) Of Engagement Activity 2018
 
Description Talk at "Society for Research Synthesis Methodology" (SRSM) annual conference, 2018 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Talk at SRSM (international methodological society) annual conference.
Audience: academics with an interest in evidence synthesis methodology
Location: Bristol, UK
Date: July 2018
Talk title: "Meta-analysis of diagnostic test accuracy: a flexible Bayesian model for multiple and explicit thresholds"
Year(s) Of Engagement Activity 2018
 
Description Talk at the International Health Economics Association (iHEA) congress, Basel, Switzerland, 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact I gave an oral presentation in an organised session on test evaluation at this international health economics conference. My title was "Meta-analysis of test accuracy across multiple thresholds for decision making"
Year(s) Of Engagement Activity 2019
 
Description Talk at the Methods for Economic Evaluation of Diagnostics (MEED) Research Forum, Manchester, UK. 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Gave an oral presentation titled "Meta-analysis of test accuracy across multiple thresholds for decision making" at this one day research forum. Audience mostly academic - plus some industry representatives and guideline developers
Year(s) Of Engagement Activity 2019
 
Description Talk on methods for meta-analysis of diagnostic test accuracy - ISCB conference, Spain 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact I presented at the International Society for Clinical Biostatistics (ISCB) conference in Vigo, Spain. Title: "Quantifying how test accuracy depends on threshold in a meta-analysis"
Year(s) Of Engagement Activity 2017
 
Description Talk on methods for meta-analysis of diagnostic test accuracy at the Global Evidence Summit 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Gave talk at the Global Evidence Summit (Cape Town, South Africa), titled 'Quantifying how test accuracy depends on threshold in a meta-analysis'
Year(s) Of Engagement Activity 2017