Semi-supervised detection and tracking of instruments for robotic surgery guidance

Lead Research Organisation: King's College London
Department Name: Imaging & Biomedical Engineering

Abstract

Aim of the PhD Project:

- Robust, real-time detection and tracking of surgical tools
- Learning from a combination of small-scale annotated and large-scale non-annotated datasets of robotic surgery video footage
- Advancing the state of the art in combining self-supervision, weak supervision, and semi-supervision for surgical vision tasks
- Designing and validating stereo-vision-based learning paradigms (a minimal illustrative sketch follows this list)
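
To make the last aim concrete, below is a minimal, purely illustrative sketch of one stereo-vision-based self-supervision signal: a network's predicted disparity map is used to warp the right endoscopic view into the left one, and the photometric difference between the warped and true left images provides a training loss that requires no manual annotations. The PyTorch code, the tensor shapes, and the name disp_net are assumptions made for illustration, not the project's actual design.

import torch
import torch.nn.functional as F

def warp_right_to_left(right: torch.Tensor, disparity: torch.Tensor) -> torch.Tensor:
    """Sample the right image at locations shifted by the predicted disparity.

    right: (B, C, H, W) image tensor; disparity: (B, 1, H, W) in pixels.
    """
    b, _, h, w = right.shape
    # Base sampling grid in the normalised [-1, 1] coordinates grid_sample expects.
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h, device=right.device),
        torch.linspace(-1.0, 1.0, w, device=right.device),
        indexing="ij",
    )
    # Horizontal shift: disparity in pixels converted to normalised units.
    xs = xs.unsqueeze(0) - 2.0 * disparity.squeeze(1) / w
    grid = torch.stack((xs, ys.unsqueeze(0).expand_as(xs)), dim=-1)
    return F.grid_sample(right, grid, align_corners=True)

def photometric_loss(left, right, disparity):
    """L1 difference between the left image and the right image warped into its view."""
    return (left - warp_right_to_left(right, disparity)).abs().mean()

# Hypothetical usage with a disparity-predicting network `disp_net`:
#   loss = photometric_loss(left_img, right_img, disp_net(left_img))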

Project Description:

This project seeks to advance the state of the art in AI-based surgical tool detection and tracking by designing novel semi-supervised and weakly-supervised approaches capable of robust, real-time performance.

Automatic detection and tracking of surgical tools from laparoscopic surgery videos is bound to play a key role in enabling surgery with autonomous features and in providing advanced surgical assistance to the clinical team. Knowing how many instruments are in the scene and where they are enables a wealth of applications, such as: placing informative overlays on the screen; performing augmented reality without occluding instruments; surgical workflow analysis; visual servoing; surgical task automation; etc.

When the tools and vision system (endoscope) are robotically manipulated, one could expect the kinematic information of the robot to provide accurate tool positioning. In practice, the large number of joints, cables and other sources of slack and hysteresis in the robots makes translating kinematic information into calibrated positioning data largely unrepeatable and error-prone. An alternative is to use the endoscope itself, which is already present, as a sensor. One of the first methods devised to help detect surgical tools in video was the placement of fiducial markers on the instruments. If the markers look markedly different from the observed tissue, the detection task can be solved with very simple processing. Yet adding fiducials to surgical instruments has been strongly rejected by medical device manufacturers because it presents significant disadvantages (e.g. sterilisation, positioning, occlusion by blood).

Training deep convolutional networks on manually annotated datasets for instrument detection and tracking in surgical scenes is a promising approach. Yet producing manual annotations of surgical videos is tedious, time-consuming and non-scalable, given the evolving nature of surgical technology and the expertise required for the annotation task. No large datasets comparable to industry standards in the computer vision community are available for these tasks. This lack of large-scale annotated data will remain a rate-limiting factor in achieving the robustness and accuracy required to use surgical detection and tracking in patient-critical tasks such as autonomous surgery. As in the autonomous driving field, automating the creation of realistic synthetic training datasets through advanced simulation may help improve the performance of deep learning approaches, but a gap is likely to remain if only small amounts of real data are exploited alongside simulation.

This project aims to combine weakly-supervised, self-supervised and semi-supervised learning so as to exploit large amounts of non-annotated real data together with small-scale manually annotated datasets, thereby keeping the required annotation effort low. Although each of these approaches has been developed in the literature, and some have been evaluated on surgical videos, the challenging open research questions addressed in this proposal relate to optimally combining them in a methodologically sound and scalable framework. A minimal sketch of one such combination is given below.
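
As an illustration only, the sketch below shows one standard way to mix the two data sources in a single training step: a supervised loss on the small annotated batch plus a pseudo-label loss on unlabelled frames, where the model's own confident predictions act as targets. For simplicity the model is assumed to be a per-pixel tool/background segmenter; all names, the confidence threshold and the loss weight are assumptions rather than the project's actual method.

import torch
import torch.nn.functional as F

def semi_supervised_step(model, labelled, unlabelled, optimiser,
                         threshold: float = 0.9, unsup_weight: float = 1.0):
    images, masks = labelled   # small manually annotated batch: (B, C, H, W), (B, H, W)
    frames = unlabelled        # larger batch of raw surgical video frames

    # Supervised term: cross-entropy against the manual annotations.
    sup_loss = F.cross_entropy(model(images), masks)

    # Pseudo-label term: predict on unlabelled frames and keep only confident pixels.
    with torch.no_grad():
        probs = model(frames).softmax(dim=1)
        confidence, pseudo_labels = probs.max(dim=1)
    unsup_loss = F.cross_entropy(model(frames), pseudo_labels, reduction="none")
    unsup_loss = (unsup_loss * (confidence > threshold)).mean()

    loss = sup_loss + unsup_weight * unsup_loss
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()

In practice the pseudo-labelling pass would typically be run on a weakly augmented view (or by an exponential-moving-average teacher) while the gradient pass uses a strongly augmented one, as in FixMatch-style methods.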


Planned Impact

Strains on the healthcare system in the UK create an acute need for more effective, efficient, safe and accurate non-invasive imaging solutions for clinical decision-making, both in terms of diagnosis and prognosis, and to reduce unnecessary treatment procedures and associated costs. Medical imaging is currently undergoing a step-change facilitated by the advent of artificial intelligence (AI) techniques, in particular deep learning and statistical machine learning, the development of targeted molecular imaging probes, and novel "push-button" imaging techniques. The availability of low-cost imaging solutions also creates unique opportunities to improve the sensitivity and specificity of treatment options, leading to better patient outcomes, improved clinical workflow and better healthcare economics. However, a skills gap exists between these disciplines, which this CDT aims to fill.

Consistent with our vision for the CDT in Smart Medical Imaging to train the next generation of medical imaging scientists, we will engage with the key beneficiaries of the CDT: (1) PhD students & their supervisors; (2) patient groups & their carers; (3) clinicians & healthcare providers; (4) healthcare industries; and (5) the general public. We have identified the following areas of impact resulting from the operation of the CDT.

- Academic Impact: The proposed multidisciplinary training and skills development are designed to lead to an appreciation of the clinical translation of technology and to generate pathways to impact in the healthcare system. Impact will be measured in terms of our students' generation of knowledge (research outputs, conference presentations, awards, software, patents) and their successful career destinations across a wide range of sectors; newly stimulated academic collaborations; and the positive effect these will have on their supervisors, their career progression, the added value to their research groups, and the universities as a whole in attracting new academic talent at all career levels.

- Economic Impact: Our students will have high employability in a wide range of sectors thanks to their broad interdisciplinary training, transferable skill sets and exposure to industry, international labs and the hospital environment. Healthcare providers (e.g. the NHS) will gain access to new technologies that are more precise and cost-efficient, reducing patient treatment and monitoring costs. Relevant healthcare industries (from major companies to SMEs) will benefit and ultimately profit from collaborative research with a high emphasis on clinical translation and validation, and from a unique cohort of newly skilled, multidisciplinary researchers who value and understand the role of industry in developing and applying novel imaging technologies across the entire patient pathway.

- Societal Impact: Patients and their professional carers will be the ultimate beneficiaries of the new imaging technologies created by our students, and by the emerging cohort of graduated medical imaging scientists and engineers who will have a strong emphasis on patient healthcare. This will have significant societal impact in terms of health and quality of life. Clinicians will benefit from new technologies aimed at enabling more robust, accurate, and precise diagnoses, treatment and follow-up monitoring. The general public will benefit from learning about new, cutting-edge medical imaging technology, and new talent will be drawn into STEM(M) professions as a consequence, further filling the current skills gap between healthcare provision and engineering.

We have developed detailed pathways-to-impact activities, coordinated by a dedicated Impact & Engagement Manager, that include impact training provision, translational activities with clinicians and patient groups, industry cooperation and entrepreneurship training, international collaboration and networks, and engagement with the general public.


Studentship Projects

Project Reference: EP/S022104/1 (Start: 01/10/2019, End: 31/03/2028)
Studentship: 2606550, related to EP/S022104/1 (Start: 01/10/2021, End: 30/09/2025), Student: Meng Wei