Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery.

Authors

Wang Hongqiu, Yang Guang, Zhang Shichen, Qin Jing, Guo Yike, Xu Bo, Jin Yueming, Zhu Lei

Publication information

IEEE Trans Med Imaging. 2024 Dec;43(12):4457-4469. doi: 10.1109/TMI.2024.3426953. Epub 2024 Dec 2.

DOI: 10.1109/TMI.2024.3426953
PMID: 38990752
Abstract

Surgical instrument segmentation is fundamentally important for facilitating cognitive intelligence in robot-assisted surgery. Although existing methods have achieved accurate instrument segmentation results, they simultaneously generate segmentation masks of all instruments, which lack the capability to specify a target object and allow an interactive experience. This paper focuses on a novel and essential task in robotic surgery, i.e., Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the target surgical instruments from each video frame, referred by a given language expression. This interactive feature offers enhanced user engagement and customized experiences, greatly benefiting the development of the next generation of surgical education systems. To achieve this, this paper constructs two surgery video datasets to promote the RSVIS research. Then, we devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance, while previous work only utilized video-level information. Meanwhile, we design a Graph-based Relation-aware Module (GRM) to model the correlation between multi-modal information (i.e., textual description and video frame) to facilitate the extraction of instrument-level information. Extensive experimental results on two RSVIS datasets exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods. We will release our code and dataset for future research (https://github.com/whq-xxh/RSVIS).

Similar articles

1
Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery.
IEEE Trans Med Imaging. 2024 Dec;43(12):4457-4469. doi: 10.1109/TMI.2024.3426953. Epub 2024 Dec 2.
2
Point-cloud segmentation with in-silico data augmentation for prostate cancer treatment.
Med Phys. 2025 Apr 3. doi: 10.1002/mp.17815.
3
TUNeS: A Temporal U-Net With Self-Attention for Video-Based Surgical Phase Recognition.
IEEE Trans Biomed Eng. 2025 Jul;72(7):2105-2119. doi: 10.1109/TBME.2025.3535228.
4
Short-Term Memory Impairment
5
A segment anything model-guided and match-based semi-supervised segmentation framework for medical imaging.
Med Phys. 2025 Mar 29. doi: 10.1002/mp.17785.
6
Diffusion semantic segmentation model: A generative model for medical image segmentation based on joint distribution.
Med Phys. 2025 Jul;52(7):e17928. doi: 10.1002/mp.17928. Epub 2025 Jun 8.
7
Rethinking data imbalance in class incremental surgical instrument segmentation.
Med Image Anal. 2025 Oct;105:103728. doi: 10.1016/j.media.2025.103728. Epub 2025 Jul 22.
8
PDZSeg: adapting the foundation model for dissection zone segmentation with visual prompts in robot-assisted endoscopic submucosal dissection.
Int J Comput Assist Radiol Surg. 2025 Jun 20. doi: 10.1007/s11548-025-03437-7.
9
Boundary-aware information maximization for self-supervised medical image segmentation.
Med Image Anal. 2024 May;94:103150. doi: 10.1016/j.media.2024.103150. Epub 2024 Mar 28.
10
Multi-level channel-spatial attention and light-weight scale-fusion network (MCSLF-Net): multi-level channel-spatial attention and light-weight scale-fusion transformer for 3D brain tumor segmentation.
Quant Imaging Med Surg. 2025 Jul 1;15(7):6301-6325. doi: 10.21037/qims-2025-354. Epub 2025 Jun 30.