Estimating uncertainties in cancer imaging using deep learning methods

Lead Research Organisation: University of Oxford

Abstract

Making automatic intelligent systems robust is one of the key challenges in medical applications. When basing healthcare decisions on outcomes of machine learning (ML) methods, validating their robustness and safety is vital. Part of this is estimating the uncertainties arising from the data and from the model itself. Distinguishing between the types of uncertainty influencing decisions (e.g., aleatoric and epistemic) enables us to evaluate and handle them more robustly.

Deep learning (DL) has achieved unprecedented performance in many areas of machine learning, in particular in computer vision, and has thus seen increasing interest from medical researchers. Whereas classical probabilistic ML methods can estimate model uncertainty, this has proven to be a challenge in deep learning. In recent years, several methods have been developed to estimate uncertainty in deep neural networks (DNNs): for instance, Monte-Carlo dropout, deep ensembles, or deterministic uncertainty quantification [1,2,3]. Each of these methods approaches uncertainty in a fundamentally different way, leading to different theoretical properties of the estimates, as well as different practical values and application contexts.

Some of these methods have been applied in medical settings: for example, one paper tested diagnosing people with varying degrees of diabetic retinopathy using retina scans [4], and another attempted to classify histopathological images of colorectal cancer [5]. In these papers, Bayesian deep learning is used, for example, to inform active learning, and is compared to other methods. The results look promising. However, there are still gaps. For example, uncertainty is commonly defined in terms of the entropy of the predictive distribution, which is a difficult choice when communicating results to clinicians. Furthermore, none of the papers embed model predictions in a decision framework applicable to clinical practice. Deep learning has the potential to be involved in many stages of diagnosing cancer. Primarily, it is being applied to medical imaging (such as histology, X-ray, CT, and MRI), but it might also have potential in multimodal datasets.

Aim 1: Understand the characteristics of suitable uncertainty estimates in the medical context. One key requirement for obtaining good uncertainty estimates is understanding what characterizes them as "good". We will explore different theoretical groundings of estimators and their properties. For instance, some desirable properties, such as consistency, have not been explored for some proposed estimators. In addition, practical experiments exploring model output on corrupted X-ray images (e.g., distorted images) could lead to insights into the practical reliability and validity of estimators.

Aim 2: Provide direct insights into cancer diagnostics. By examining methods on radiology images, this project could provide direct insight into cancer diagnostic pathways. While most experiments are methodology-focused, the knowledge could be transferred into an application-oriented model improving on current attempts to automatically detect abnormalities in chest X-ray images or thoracic CT. Such a model could combine multiple technologies, i.e., explainability methods, uncertainty estimates, and multimodal data, to offer human-interpretable, robust, and state-of-the-art performance.

Establishing a deep learning framework that overcomes current limitations facilitates the safe use of deep learning in medical practice. An automated radiology model, for instance, can have a direct impact on patient well-being by providing a diagnostic tool with great precision and robustness, which is accessible and reduces healthcare cost. Such expert systems can have a particular impact in regions with a shortage of qualified radiologists.

This project falls within the EPSRC "Medical Imaging" and "Statistics and Applied Probability" research areas.
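
As an illustration of the entropy-based definition of uncertainty mentioned in the abstract, the following is a minimal sketch of one common information-theoretic decomposition: the entropy of the averaged prediction is treated as total uncertainty, the average per-member entropy as the aleatoric part, and their difference (the mutual information) as the epistemic part. It is not taken from the project itself. The function names and the input format (a list of class-probability vectors, one per ensemble member or Monte-Carlo dropout pass) are assumptions made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for a single input.

    member_probs is a hypothetical input format: one class-probability
    vector per ensemble member or stochastic forward pass.
    Returns (total, aleatoric, epistemic), all in nats.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])

    # Predictive distribution: average the members' class probabilities.
    mean_probs = [
        sum(member[c] for member in member_probs) / n_members
        for c in range(n_classes)
    ]

    total = entropy(mean_probs)  # entropy of the averaged prediction
    aleatoric = sum(entropy(m) for m in member_probs) / n_members  # expected entropy
    epistemic = total - aleatoric  # mutual information between prediction and model
    return total, aleatoric, epistemic


# Members agree that the case is ambiguous: uncertainty is aleatoric.
agree = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
# Members are individually confident but contradict each other: epistemic.
disagree = [[0.99, 0.01], [0.01, 0.99]]

print(decompose_uncertainty(agree))
print(decompose_uncertainty(disagree))
```

Both toy inputs have the same total entropy, yet they differ in where it comes from, which is one reason a single entropy value can be hard to communicate to clinicians.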

Planned Impact

In the same way that bioinformatics has transformed genomic research and clinical practice, health data science will have a dramatic and lasting impact upon the broader fields of medical research, population health, and healthcare delivery. The beneficiaries of the proposed training programme, and of the research that it delivers and enables, will include academia, industry, healthcare, and the broader UK economy.

Academia: Graduates of the training programme will be well placed to start their post-doctoral careers in leading academic institutions, engaging in high-impact multi-disciplinary research, helping to build training and research capacity, sharing their experience within the wider academic community.

Industry: Partner organisations will benefit from close collaboration with leading researchers, from the joint exploration of research priorities, and from the commercialisation of arising intellectual property. Other organisations will benefit from the availability of highly-qualified graduates with skills in big health data analytics.

Healthcare: Healthcare organisations and patients will benefit from the results of enabled and accelerated health research, leading to new treatments and technologies, and an improved ability to identify and evaluate potential improvements in practice through the analysis of real-world health data.

Economy: The life sciences sector is a key component of the UK economy. The programme will provide partner companies with direct access to leading-edge research. Graduates of the programme will be well-qualified to contribute to economic growth - supporting health research and the development of new products and services - and will be able to inform policy and decision making at organisational, regional, and national levels.

Publications


Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/S02428X/1      |              |              | 01/04/2019 | 30/09/2027 |
2280532           | Studentship  | EP/S02428X/1 | 01/10/2019 | 31/12/2023 | Cornelius Emde