Surgical workflow recognition and prediction to improve outcomes in robotic-assisted surgery.
Lead Research Organisation:
King's College London
Department Name: Imaging & Biomedical Engineering
Abstract
Real-time surgical phase recognition is a complex task: surgeries are inherently dynamic, and recognising a phase requires understanding temporal dependencies. Current methodologies primarily analyze past and present information to classify surgical phases, which neglects the potential insights from future surgical actions. In response to this limitation, we propose a multi-task approach that employs an auto-regressive Transformer to simultaneously optimize phase recognition and prediction, integrating retrospective and prospective data for comprehensive analysis.
Our model utilizes a generative framework to anticipate future events, employing next token prediction to enrich the Transformer architecture's contextual understanding. By fusing past, current, and predicted future information, our method aims to achieve continuous segment recognition and anticipation.
Validated on the Cholec80 and AutoLaparo datasets, our approach has demonstrated state-of-the-art performance on phase recognition and long-term prediction. These results indicate that incorporating generative models for future information prediction can enhance surgical workflow recognition systems, particularly in real-time settings. Our research underscores the importance of integrating anticipatory cues in the analysis of surgical videos and sets a precedent for future research in the field.
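The next-token prediction idea described in the abstract can be illustrated with a minimal sketch. This is not the project's model: the function `toy_next_phase_scores` is a hypothetical, hand-written stand-in for a trained auto-regressive Transformer, phases are represented as integer ids (seven of them, an assumption), and the greedy rollout in `anticipate` is only one simple way to generate future phase tokens from observed ones.

```python
from typing import Callable, List, Sequence

# A "model" here is any callable mapping a phase-token context to one score per phase.
# In practice this would be a trained auto-regressive Transformer (assumption).
NextTokenModel = Callable[[Sequence[int]], List[float]]


def toy_next_phase_scores(context: Sequence[int], num_phases: int = 7) -> List[float]:
    """Hypothetical stand-in model: stay in the current phase until it has
    lasted three steps, then move to the next phase id (capped at the last)."""
    current = context[-1]
    # Count how many consecutive steps the current phase has lasted.
    run_length = 0
    for phase in reversed(context):
        if phase != current:
            break
        run_length += 1
    scores = [0.0] * num_phases
    if run_length >= 3:
        scores[min(current + 1, num_phases - 1)] = 1.0
    else:
        scores[current] = 1.0
    return scores


def anticipate(model: NextTokenModel, observed: Sequence[int], horizon: int) -> List[int]:
    """Greedy auto-regressive rollout: predict the next phase token, append it
    to the context, and repeat for `horizon` steps."""
    context = list(observed)
    future: List[int] = []
    for _ in range(horizon):
        scores = model(context)
        next_phase = max(range(len(scores)), key=scores.__getitem__)
        future.append(next_phase)
        context.append(next_phase)  # the prediction becomes part of the context
    return future


if __name__ == "__main__":
    # Observed phase ids so far (past and present), then a 5-step anticipation.
    print(anticipate(toy_next_phase_scores, [0, 0, 1, 1], horizon=5))
```

Each predicted token is fed back into the context, which is what makes the rollout auto-regressive; a real system would condition on video features as well as past phase tokens.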
Organisations
| People | ORCID iD |
|---|---|
| Maxence Boels (Student) | |
Studentship Projects
| Project Reference | Relationship | Related To | Start | End | Student Name |
|---|---|---|---|---|---|
| EP/R513064/1 | | | 30/09/2018 | 29/09/2023 | |
| 2554101 | Studentship | EP/R513064/1 | 31/05/2021 | 28/02/2025 | Maxence Boels |
| EP/T517963/1 | | | 30/09/2020 | 29/09/2025 | |
| 2554101 | Studentship | EP/T517963/1 | 31/05/2021 | 28/02/2025 | Maxence Boels |