Sun Xinan, Zou Yuelin, Wang Shuxin, Su He, Guan Bo
Key Laboratory of Mechanism Theory and Equipment Design of Ministry of Education, Tianjin University, 135 Yaguan Road, Tianjin, 300350, China.
School of Mechanical Engineering, Tianjin University, 135 Yaguan Road, Jinnan District, Tianjin, 300350, China.
Int J Comput Assist Radiol Surg. 2022 Oct;17(10):1903-1913. doi: 10.1007/s11548-022-02687-z. Epub 2022 Jun 10.
Automatic image segmentation of surgical instruments is a fundamental task in robot-assisted minimally invasive surgery, as it greatly improves surgeons' context awareness during an operation. This paper proposes a novel method based on Mask R-CNN to achieve accurate instance segmentation of surgical instruments.
A novel feature extraction backbone is built, which can extract both local features through a convolutional neural network branch and global representations through a Swin-Transformer branch. Moreover, skip fusions are applied in the backbone to combine the two kinds of features and improve the generalization ability of the network.
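The abstract does not give implementation details of the skip fusion. As a rough, hypothetical illustration of the dual-branch idea (names and the element-wise-addition fusion are assumptions; plain Python lists stand in for per-stage feature maps), the per-stage features from the two branches could be merged like this:

```python
def fuse_stages(cnn_feats, swin_feats):
    """Skip-fuse per-stage features from two backbone branches.

    cnn_feats, swin_feats: lists with one entry per backbone stage, each
    entry an equally shaped feature map (modeled here as a flat list of
    floats). Fusion is element-wise addition, one of the simplest choices;
    the actual fusion operator in the paper may differ.
    """
    fused = []
    for local_f, global_f in zip(cnn_feats, swin_feats):
        fused.append([a + b for a, b in zip(local_f, global_f)])
    return fused

# Toy example: two stages, three-element "feature maps" per stage.
cnn = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # local features (CNN branch)
swin = [[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]  # global features (Swin branch)
print(fuse_stages(cnn, swin))  # → [[1.5, 2.5, 3.5], [5.0, 6.0, 7.0]]
```

In a real implementation the per-stage maps would be tensors of matching spatial resolution, with the fused maps feeding the Mask R-CNN feature pyramid.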
The proposed method is evaluated on the dataset of the MICCAI 2017 EndoVis Challenge across three segmentation tasks and shows state-of-the-art performance, with an mIoU of 0.5873 in type segmentation and 0.7408 in part segmentation. Furthermore, ablation studies show that the proposed novel backbone contributes at least a 17% improvement in mIoU.
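For reference, the reported metric, mean intersection over union (mIoU), averages the per-class ratio of intersection to union between predicted and ground-truth masks. A minimal sketch (assuming flat integer label masks; the challenge's official scoring may differ in detail, e.g. in how absent classes are handled):

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in either the prediction or the
    ground truth; pred and target are equal-length lists of class labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

pred   = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, 3))  # → 0.5 (per-class IoUs: 1/3, 2/3, 1/2)
```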
The promising results demonstrate that our method can effectively extract global representations as well as local features in the segmentation of surgical instruments and improve segmentation accuracy. With the proposed novel backbone, the network can segment the contours of surgical instruments' end tips more precisely. This method can provide more accurate data for the localization and pose estimation of surgical instruments, and further contribute to the automation of robot-assisted minimally invasive surgery.