Probabilistic deep learning approaches in medical imaging

Lead Research Organisation: University of Oxford
Department Name: Sustain Approach to Biomedical Sci CDT

Abstract

In medicine today, imaging data plays an indispensable role across all aspects of care and spans most medical domains. The effective analysis of this data is paramount, requiring accurate, reliable, and efficient tools. Artificial Intelligence (AI) has emerged as a transformative solution, extensively applied across diverse imaging modalities and diseases. AI has notably enhanced the precision and efficiency of image analysis while reducing the burden on clinicians. However, a challenge persists: AI models struggle to generalize across disparate data sources, such as different hospitals or imaging devices, and to adapt to different diagnostic tasks.

One promising avenue to address these challenges is the application of statistical and probabilistic approaches to AI models. These approaches can enhance model robustness and reliability, but have yet to be fully explored, particularly within the realm of medical imaging. This research gap forms the basis of my DPhil project. My project will focus on the development and evaluation of probabilistic deep learning tools tailored to medical imaging, using methods from both frequentist and Bayesian statistics. This includes building novel models which are robust to out-of-distribution data, can be trusted, and from which causal, correlative, and confounding effects can be distinguished.

A particular focus will be placed on developing models that can attach a confidence estimate or distribution to each of their predictions. To date, most classical deep learning models provide only a point estimate of their prediction and "don't know what they don't know", which is particularly problematic when they are presented with images from a different distribution to the one on which they were trained. Quantifying uncertainty is key to models being trusted by clinicians, especially in decision-sensitive contexts such as healthcare. However, existing medical imaging models that quantify uncertainty, such as Bayesian neural networks or Monte Carlo dropout methods, often incur significantly higher computational costs: the former maintain a distribution over each of the model's weights, and the latter require many stochastic forward passes, each with a different random subset of units dropped in each layer. This project will therefore aim to develop models that can reliably estimate their uncertainty while maintaining prediction accuracy and low computational cost.
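
As a rough illustration of the Monte Carlo dropout idea mentioned above, the following minimal sketch keeps dropout active at prediction time and summarises repeated stochastic forward passes by their mean and standard deviation. The tiny network, its hand-picked weights, and the function names are hypothetical and chosen only for illustration; this is not the method the project will develop.

```python
import math
import random
import statistics

# Hypothetical fixed weights for a toy network: 2 inputs, 4 hidden units, 1 output.
W1 = [[0.5, -0.3], [0.8, 0.2], [-0.6, 0.9], [0.1, 0.4]]
B1 = [0.0, 0.1, -0.1, 0.05]
W2 = [0.7, -0.5, 0.6, 0.3]
B2 = 0.0


def forward(x, rng, drop_prob=0.5):
    """One stochastic forward pass with dropout left switched on at inference."""
    keep = 1.0 - drop_prob
    logit = B2
    for weights, bias, w_out in zip(W1, B1, W2):
        pre = sum(w * xi for w, xi in zip(weights, x)) + bias
        h = max(0.0, pre)  # ReLU activation
        if rng.random() < drop_prob:
            h = 0.0  # this hidden unit is dropped for this pass
        else:
            h /= keep  # inverted-dropout rescaling of the surviving unit
        logit += w_out * h
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid output in (0, 1)


def mc_dropout_predict(x, n_samples=200, drop_prob=0.5, seed=0):
    """Return the mean and spread of n_samples stochastic predictions for x."""
    rng = random.Random(seed)
    samples = [forward(x, rng, drop_prob) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    mean, spread = mc_dropout_predict([1.0, 2.0])
    print(f"prediction = {mean:.3f}, uncertainty (std) = {spread:.3f}")
```

The cost noted above is visible here: every prediction requires `n_samples` forward passes rather than one.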

I will primarily use multimodal Positron Emission Tomography and Computed Tomography (PET/CT) data from patients with tumours to develop and test these models. This type of data would particularly benefit from uncertainty-aware models, as there is extensive inter-scanner variability as well as variability in how clinicians interpret the scans. The project would benefit the wider deep learning research community, as code would be open-sourced and methods shared, as well as the clinic, by providing safer and more trustworthy methods. For instance, this would be important when PET/CT tumour segmentation is used to guide radiotherapy: a map of the uncertainty across the predicted tumour area would help avoid targeting potentially healthy areas, e.g. at the margins, which are notoriously harder to segment. GE Healthcare will be involved in the project as the industrial partner and will also help provide one or more curated datasets that I can work with.
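
To make the idea of an uncertainty map over a predicted tumour area more concrete, here is a minimal sketch that averages several sampled per-voxel tumour probability maps and scores each voxel by its binary entropy. The function names, the entropy threshold, and the use of 2D nested lists are assumptions made for illustration; a real pipeline would operate on 3D PET/CT volumes and might use a different uncertainty measure.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) prediction: 0 is certain, 1 is maximally uncertain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def uncertainty_map(prob_samples):
    """Average sampled probability maps voxel-wise; return (mean_map, entropy_map)."""
    n = len(prob_samples)
    rows, cols = len(prob_samples[0]), len(prob_samples[0][0])
    mean_map = [
        [sum(sample[r][c] for sample in prob_samples) / n for c in range(cols)]
        for r in range(rows)
    ]
    entropy_map = [[binary_entropy(p) for p in row] for row in mean_map]
    return mean_map, entropy_map


def confident_tumour_mask(mean_map, entropy_map, max_entropy=0.5):
    """Mark voxels predicted as tumour whose entropy is at most max_entropy (hypothetical threshold)."""
    return [
        [p > 0.5 and h <= max_entropy for p, h in zip(p_row, h_row)]
        for p_row, h_row in zip(mean_map, entropy_map)
    ]


if __name__ == "__main__":
    # Two hypothetical sampled probability maps for a 2x2 image patch.
    samples = [
        [[0.98, 0.60], [0.02, 0.95]],
        [[0.96, 0.40], [0.04, 0.97]],
    ]
    mean_map, entropy_map = uncertainty_map(samples)
    print(confident_tumour_mask(mean_map, entropy_map))
```

In this toy patch, the voxel on which the two samples disagree receives high entropy and is excluded from the confident mask, which mirrors the margin case described above.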

This project aligns with EPSRC's strategies and research areas. Specifically, this project falls within the following EPSRC research areas:

- Artificial intelligence technologies
- Image and vision computing
- Medical imaging
- Statistics and applied probability

Planned Impact

The UK's world-leading position in biomedical research is critically dependent upon training scientists with the cutting-edge research skills and technological know-how needed to drive future scientific advances. Since 2009, the EPSRC and MRC CDT in Systems Approaches to Biomedical Science (SABS) has been working with its consortium of 22 industrial and institutional partners to meet this training need.

Over this period, our partners have identified a growing training need caused by the increasing reliance on computational approaches and research software. The new EPSRC CDT in Sustainable Approaches to Biomedical Science: Responsible and Reproducible Research - SABS:R^3 will address this need. By embedding a sustainable approach to software and computational model development into all aspects of the existing SABS training programme, we aim to foster a culture change in how the computational tools and research software that now underpin much of biomedical research are developed, and hence how quantitative and predictive translational biomedical research is undertaken.

As with all CDT Programmes, the future impact of SABS:R^3 will be through its alumni, and through the culture change that its training engenders. By these measures, our existing SABS CDT is already proving remarkably successful. Our alumni have gone on to a wide range of successful careers: 21 in academic research, 19 in industry (including 5 in SABS partner companies), and the other 10 in organisations ranging from the Office for National Statistics to the EPSRC. SABS' unique Open Innovation framework has fostered new company connections and a high level of operational freedom, enabling 14 multi-company, pre-competitive, collaborative doctoral research projects between 11 companies, each focused on a SABS student.

The impact of sustainable and open computational approaches on biomedical research is clear from existing SABS student projects. Examples include SAbDab, which resulted from the first-ever co-sponsored doctorate in SABS, by UCB and Roche. It was released as open source software, is embedded in the pipelines of several pharmaceutical companies (including UCB, MedImmune, GSK, and Lonza), and has resulted in 13 papers. The SABS student who developed SAbDab was initially seconded to MedImmune, sponsored by EPSRC IAA funding; he went on to work at Roche, and is now at BenevolentAI. Similarly, PanDDA, multi-dataset X-ray crystallographic software for detecting ligand-bound states in protein complexes, is in CCP4 and is an integral part of Diamond Light Source's XChem Pipeline. The SABS student who developed PanDDA was awarded an EMBO Fellowship.

Future SABS:R^3 students will undertake research supported by both our industrial partners and academic supervisors. These supervisors have a strong track record of high impact research through the release of open source software, computational tools, and databases, and through commercialisation and licensing of their research. All of this research has been undertaken in collaboration with industrial partners, with many examples of these tools now in routine use within partner companies.

The newly focused SABS:R^3 will permit new industrial collaborations. Six new partners have joined the consortium to support this new bid, ranging from major multinationals (e.g. Unilever) to SMEs (e.g. Lhasa). SABS:R^3 will continue to make all of its research and teaching resources publicly available and will continue to help create other centres with similar aims. To promote a wider cultural change, SABS:R^3 will also engage with the academic publishing industry (Elsevier, OUP, and Taylor & Francis). We will explore novel ways of disseminating the outputs of computational biomedical research, to engender trust in the released tools and software and to facilitate greater uptake and re-use.

Studentship Projects

| Project Reference | Relationship | Related To | Start | End | Student Name |
| --- | --- | --- | --- | --- | --- |
| EP/S024093/1 | | | 01/10/2019 | 31/03/2028 | |
| 2736482 | Studentship | EP/S024093/1 | 01/10/2022 | 30/09/2026 | Anissa Alloula |