Clinically Interpretable AI Diagnosis of Diabetic Retinopathy from Fundus Images using Vision Transformers

Lead Research Organisation: University College London
Department Name: Institute of Health Informatics

Abstract

A recent paper by Dosovitskiy et al. demonstrated the successful implementation of transformers for vision classification tasks. The paper showed that the transformer achieves high performance with a lower training cost than commonly used convolutional neural networks (CNNs). It was also shown that the attention of the transformer for a given input can be visualised, see Fig 1, allowing a level of interpretability of the model's outputs that is not common in standard CNN models.
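
As an illustration of how such an attention visualisation might be produced, the minimal sketch below implements attention rollout (Abnar and Zuidema, 2020), the scheme used for the attention maps in the vision transformer paper. It assumes per-layer attention matrices have already been extracted from the model; the shapes used here correspond to a ViT-Base-style configuration (12 layers, 12 heads, 196 image patches plus a [CLS] token) and are illustrative only, not the project's actual pipeline.

    import torch

    def attention_rollout(per_layer_attn):
        """Combine per-layer attention maps into one [CLS]-to-patch relevance map."""
        num_tokens = per_layer_attn[0].shape[-1]
        rollout = torch.eye(num_tokens)
        for attn in per_layer_attn:
            attn_avg = attn.mean(dim=0)                               # average over heads
            attn_aug = 0.5 * attn_avg + 0.5 * torch.eye(num_tokens)   # account for residual connections
            attn_aug = attn_aug / attn_aug.sum(dim=-1, keepdim=True)  # re-normalise rows
            rollout = attn_aug @ rollout                              # propagate attention through the layers
        cls_to_patches = rollout[0, 1:]               # [CLS] token's attention over image patches
        side = int(cls_to_patches.numel() ** 0.5)     # 14 for 224px images with 16px patches
        return cls_to_patches.reshape(side, side)

    # Illustrative call with random attention maps: 12 layers, 12 heads,
    # 197 tokens (1 [CLS] + 14*14 patches), as in ViT-Base/16 at 224px resolution.
    dummy_attn = [torch.softmax(torch.randn(12, 197, 197), dim=-1) for _ in range(12)]
    heatmap = attention_rollout(dummy_attn)   # 14x14 map to upsample and overlay on the fundus image
    print(heatmap.shape)                      # torch.Size([14, 14])

The resulting low-resolution map can be upsampled to the input size and overlaid on the fundus image to indicate which regions drove the classification.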

The transfer of this transformer model (for which the code and a model pre-trained on ImageNet-21k are publicly available) to an ophthalmic diagnostic classification task, see Section 2, could potentially result in a clinically interpretable model with state-of-the-art performance. This would be achieved with no further supervision than a class label. In the extreme, if transformers prove to outperform and replace CNNs in vision tasks, as they have in the NLP domain, this project would be at the frontier of that change. Furthermore, this project ties directly into the CDT's research theme of 'AI-enabled diagnostics or prognostics' and would be the first example of the use of a vision transformer in the medical domain.
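
A minimal sketch of the transfer-learning step is given below. It uses the timm library's ViT-Base checkpoint as a stand-in for the publicly released pre-trained weights, a hypothetical fundus_train/ directory arranged with one sub-folder per diabetic retinopathy grade (five classes on the standard ICDR scale), and plain cross-entropy training on the class label alone; none of these choices are fixed by the project.

    import timm
    import torch
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ])
    # Hypothetical dataset: one sub-folder per diabetic retinopathy grade.
    train_set = datasets.ImageFolder("fundus_train/", transform=transform)
    train_loader = DataLoader(train_set, batch_size=16, shuffle=True)

    # Pre-trained ViT with a fresh 5-way classification head; only a class
    # label per image is required, as described above.
    model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=5)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    criterion = torch.nn.CrossEntropyLoss()

    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()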

Publications


Studentship Projects

Project Reference   Relationship   Related To      Start        End          Student Name
EP/S021612/1                                       01/04/2019   30/09/2027
2418785             Studentship    EP/S021612/1    28/09/2020   30/09/2024   Simon Ellershaw