Against spatial-temporal discrepancy: contrastive learning-based network for surgical workflow recognition.

Affiliations

Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.

University of Chinese Academy of Sciences, Beijing, China.

Publication information

Int J Comput Assist Radiol Surg. 2021 May;16(5):839-848. doi: 10.1007/s11548-021-02382-5. Epub 2021 May 5.

DOI: 10.1007/s11548-021-02382-5
PMID: 33950398

Abstract

PURPOSE

Automatic workflow recognition from surgical videos is fundamental to developing context-aware systems in modern operating rooms. Although many approaches have been proposed for this complex task, problems remain, such as the fine-grained characteristics of surgical videos and their spatial-temporal discrepancies.

METHODS

We propose a contrastive learning-based convolutional recurrent network with multi-level prediction to tackle these problems. Specifically, split-attention blocks are employed to extract spatial features. Through a mapping function in the step-phase branch, the current workflow can be predicted on two mutual-boosting levels. Furthermore, a contrastive branch is introduced to learn the spatial-temporal features that eliminate irrelevant changes in the environment.
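
The two-level prediction described above can be illustrated with a minimal sketch. The abstract does not specify the mapping function, so this version assumes a fixed many-to-one lookup from fine-grained steps to coarse phases and sums step probabilities into phase probabilities. The step and phase names, `STEP_TO_PHASE`, `phase_probs_from_steps`, and `predict` are all hypothetical and are not taken from the paper or from the Cataract-101 label set.

```python
from collections import defaultdict

# Hypothetical many-to-one mapping from fine-grained steps to coarse phases.
# These labels are illustrative only; they are not the dataset's label set.
STEP_TO_PHASE = {
    "incision": "opening",
    "viscoelastic_injection": "opening",
    "phacoemulsification": "lens_removal",
    "irrigation_aspiration": "lens_removal",
    "lens_implantation": "implantation",
}


def phase_probs_from_steps(step_probs):
    """Sum per-step probabilities into per-phase probabilities."""
    phase_probs = defaultdict(float)
    for step, prob in step_probs.items():
        phase_probs[STEP_TO_PHASE[step]] += prob
    return dict(phase_probs)


def predict(step_probs):
    """Return the most likely (step, phase) pair for a single frame."""
    phase_probs = phase_probs_from_steps(step_probs)
    best_step = max(step_probs, key=step_probs.get)
    best_phase = max(phase_probs, key=phase_probs.get)
    return best_step, best_phase
```

Under this assumed mapping, the coarse level pools evidence from all of its steps, so the most likely phase need not be the phase of the single most likely step. A signal of this kind is one plausible way for the two levels to inform each other.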

RESULTS

We evaluate our method on the Cataract-101 dataset. The results show that our method achieves an accuracy of 96.37% with only surgical step labels, which outperforms other state-of-the-art approaches.
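
For readers unfamiliar with the metric, the sketch below computes accuracy as the fraction of video frames whose predicted label matches the annotation. This frame-level definition is an assumption; the abstract does not state how accuracy is computed, and `frame_accuracy` is a hypothetical helper.

```python
def frame_accuracy(predicted, annotated):
    """Fraction of frames whose predicted label equals the annotated label."""
    if len(predicted) != len(annotated):
        raise ValueError("label sequences must have the same length")
    if not predicted:
        raise ValueError("label sequences must not be empty")
    correct = sum(p == a for p, a in zip(predicted, annotated))
    return correct / len(predicted)
```

For example, `frame_accuracy(["a", "a", "b", "b"], ["a", "b", "b", "b"])` counts three matching frames out of four.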

CONCLUSION

The proposed convolutional recurrent network based on step-phase prediction and contrastive learning can leverage fine-grained characteristics and alleviate spatial-temporal discrepancies to improve the performance of surgical workflow recognition.
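
The abstract does not give the loss used by the contrastive branch. As a rough illustration of how contrastive learning can suppress irrelevant variation, the sketch below uses an InfoNCE-style loss, a common choice named here as a stand-in and not as the authors' method. It pulls an anchor feature toward a positive (for example, a feature from the same workflow step under different conditions) and pushes it away from negatives. The function names, the use of cosine similarity, and the temperature value are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor, one positive, and several negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # The loss is small when the anchor is far more similar to the positive
    # than to any negative, and large otherwise.
    return -math.log(pos / (pos + neg))
```

Minimizing a loss of this form encourages features of the same step to stay close even when appearance changes, which matches the stated aim of alleviating spatial-temporal discrepancies.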

Similar articles

1. Against spatial-temporal discrepancy: contrastive learning-based network for surgical workflow recognition.
   Int J Comput Assist Radiol Surg. 2021 May;16(5):839-848. doi: 10.1007/s11548-021-02382-5. Epub 2021 May 5.
2. SV-RCNet: Workflow Recognition From Surgical Videos Using Recurrent Convolutional Network.
   IEEE Trans Med Imaging. 2018 May;37(5):1114-1126. doi: 10.1109/TMI.2017.2787657.
3. Surgical workflow recognition with temporal convolution and transformer for action segmentation.
   Int J Comput Assist Radiol Surg. 2023 Apr;18(4):785-794. doi: 10.1007/s11548-022-02811-z. Epub 2022 Dec 21.
4. LRTD: long-range temporal dependency based active learning for surgical workflow recognition.
   Int J Comput Assist Radiol Surg. 2020 Sep;15(9):1573-1584. doi: 10.1007/s11548-020-02198-9. Epub 2020 Jun 25.
5. Temporal-based Swin Transformer network for workflow recognition of surgical video.
   Int J Comput Assist Radiol Surg. 2023 Jan;18(1):139-147. doi: 10.1007/s11548-022-02785-y. Epub 2022 Nov 4.
6. Semi-supervised learning with progressive unlabeled data excavation for label-efficient surgical workflow recognition.
   Med Image Anal. 2021 Oct;73:102158. doi: 10.1016/j.media.2021.102158. Epub 2021 Jul 8.
7. Multi-task temporal convolutional networks for joint recognition of surgical phases and steps in gastric bypass procedures.
   Int J Comput Assist Radiol Surg. 2021 Jul;16(7):1111-1119. doi: 10.1007/s11548-021-02388-z. Epub 2021 May 19.
8. Global-local multi-stage temporal convolutional network for cataract surgery phase recognition.
   Biomed Eng Online. 2022 Nov 30;21(1):82. doi: 10.1186/s12938-022-01048-w.
9. Cross-view motion consistent self-supervised video inter-intra contrastive for action representation understanding.
   Neural Netw. 2024 Nov;179:106578. doi: 10.1016/j.neunet.2024.106578. Epub 2024 Jul 26.
10. Assisted phase and step annotation for surgical videos.
    Int J Comput Assist Radiol Surg. 2020 Apr;15(4):673-680. doi: 10.1007/s11548-019-02108-8. Epub 2020 Feb 10.

Cited by

1. Dynamic data balancing strategy-based Xception-dual-channel LSTM model for laparoscopic cholecystectomy phase recognition.
   Int J Comput Assist Radiol Surg. 2025 Sep 7. doi: 10.1007/s11548-025-03509-8.
2. Evaluation of single-stage vision models for pose estimation of surgical instruments.
   Int J Comput Assist Radiol Surg. 2023 Dec;18(12):2125-2142. doi: 10.1007/s11548-023-02890-6. Epub 2023 Apr 30.