Multi-frame Feature Aggregation for Real-time Instrument Segmentation in Endoscopic Video

Lin, Shan; Qin, Fangbo; Peng, Haonan; Bly, Randall A.; Moe, Kris S.; Hannaford, Blake

doi:10.1109/LRA.2021.3096156

arXiv:2011.08752 (cs)

[Submitted on 17 Nov 2020 (v1), last revised 26 Jul 2021 (this version, v2)]

Abstract: Deep learning-based methods have achieved promising results on surgical instrument segmentation. However, their high computation cost may limit the application of deep models to time-sensitive tasks such as online surgical video analysis for robot-assisted surgery. Moreover, current methods may still struggle under challenging conditions in surgical images, such as varying lighting and the presence of blood. We propose a novel Multi-frame Feature Aggregation (MFFA) module that aggregates video frame features temporally and spatially in a recurrent mode. By distributing the computation load of deep feature extraction over sequential frames, we can use a lightweight encoder to reduce the computation cost at each time step. In addition, public surgical videos are usually not labeled frame by frame, so we develop a method that randomly synthesizes a surgical frame sequence from a single labeled frame to assist network training. We demonstrate that our approach outperforms the corresponding deeper segmentation models on two public surgery datasets.
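The idea of synthesizing a frame sequence from a single labeled frame can be illustrated with a minimal sketch. The abstract does not say which transformations the authors use, so everything here is an assumption made for illustration: the random integer translations, the hypothetical helper names `shift` and `synthesize_sequence`, the parameters `length` and `max_step`, and the choice to place the real labeled frame last. The one property the sketch is meant to show is that the image and its mask receive the same transformation, so every synthetic frame stays correctly labeled.

```python
import random


def shift(grid, dy, dx, fill=0):
    """Translate a 2D grid by (dy, dx); cells left uncovered are set to `fill`."""
    h, w = len(grid), len(grid[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = grid[y][x]
    return out


def synthesize_sequence(image, mask, length=4, max_step=1, seed=None):
    """Build `length` (image, mask) pairs from one labeled frame.

    Assumption for illustration: camera motion is modeled as a random walk
    of integer translations, and the original frame is placed last.
    """
    rng = random.Random(seed)
    dy = dx = 0
    offsets = [(0, 0)]  # zero offset = the original labeled frame
    for _ in range(length - 1):
        dy += rng.randint(-max_step, max_step)
        dx += rng.randint(-max_step, max_step)
        offsets.append((dy, dx))
    offsets.reverse()  # the sequence drifts toward the real frame
    # Apply the identical offset to image and mask so labels stay aligned.
    return [(shift(image, oy, ox), shift(mask, oy, ox)) for oy, ox in offsets]


if __name__ == "__main__":
    image = [[0, 0, 0, 0], [0, 5, 5, 0], [0, 5, 5, 0], [0, 0, 0, 0]]
    mask = [[1 if v else 0 for v in row] for row in image]
    for frame, label in synthesize_sequence(image, mask, length=3, seed=0):
        print(frame, label)
```

A practical pipeline would likely add rotation, scaling, and appearance changes, but the pairing of each transformed image with an identically transformed mask would stay the same.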

Comments:	Published in IEEE Robotics and Automation Letters (Early Access)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2011.08752 [cs.CV]
	(or arXiv:2011.08752v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2011.08752
Related DOI:	https://doi.org/10.1109/LRA.2021.3096156

Submission history

From: Shan Lin
[v1] Tue, 17 Nov 2020 16:27:27 UTC (1,215 KB)
[v2] Mon, 26 Jul 2021 00:39:27 UTC (3,387 KB)
