
Can surgical computer vision benefit from large-scale visual foundation models?

Affiliation

DIA2M, DRCI, CHU Clermont-Ferrand, Clermont-Ferrand, France.

Publication Information

Int J Comput Assist Radiol Surg. 2024 Jun;19(6):1157-1163. doi: 10.1007/s11548-024-03125-y. Epub 2024 Apr 12.

DOI: 10.1007/s11548-024-03125-y
PMID: 38609735
Abstract

PURPOSE

We investigate whether foundation models pretrained on diverse visual data could be beneficial to surgical computer vision. We use instrument and uterus segmentation in minimally invasive procedures as benchmarks. We propose multiple supervised, unsupervised and few-shot supervised adaptations of foundation models, including two novel adaptation methods.

METHODS

We use DINOv1, DINOv2, DINOv2 with registers, and SAM backbones, with the ART-Net surgical instrument and the SurgAI3.8K uterus segmentation datasets. We investigate five approaches: DINO unsupervised, few-shot learning with a linear decoder, supervised learning with the proposed DINO-UNet adaptation, DPT with DINO encoder, and unsupervised learning with the proposed SAM adaptation.
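The few-shot linear-decoder approach described above can be sketched minimally: a frozen backbone maps image patches to features, and only a small linear softmax head is trained on a handful of labeled patches. The sketch below is an illustration only; a random projection stands in for the frozen DINO encoder, and all shapes, patch counts, and hyperparameters are assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, CLASSES, N_PATCHES = 384, 2, 256  # assumed ViT-S-like feature dim

# "Frozen backbone" stand-in: a fixed random projection from flattened
# 14x14 RGB patches to DIM-d features (the paper uses DINO/DINOv2/SAM
# encoders here; this stub only mimics the frozen-features interface).
W_frozen = rng.standard_normal((3 * 14 * 14, DIM)) / np.sqrt(3 * 14 * 14)

patches = rng.standard_normal((N_PATCHES, 3 * 14 * 14))  # few-shot patches
labels = rng.integers(0, CLASSES, N_PATCHES)             # per-patch masks

feats = patches @ W_frozen  # (N_PATCHES, DIM); backbone stays frozen

# Linear decoder: the only trainable part, a DIM -> CLASSES softmax map
# trained with plain gradient descent on cross-entropy.
W = np.zeros((DIM, CLASSES))
onehot = np.eye(CLASSES)[labels]
for _ in range(200):
    logits = feats @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    W -= 0.1 * feats.T @ (p - onehot) / N_PATCHES

acc = float((np.argmax(feats @ W, axis=1) == labels).mean())
print(feats.shape, W.shape, round(acc, 2))
```

Because the backbone is never updated, the trainable parameter count is just `DIM * CLASSES`, which is why such a decoder can be fit from only a few annotated frames.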

RESULTS

We evaluate 17 models for instrument segmentation and 7 models for uterus segmentation and compare to existing ad hoc models for the tasks at hand. We show that the linear decoder can be learned with few shots. The unsupervised and linear decoder methods obtain slightly subpar results but could be considered useful in data scarcity settings. The unsupervised SAM model produces finer edges but has inconsistent outputs. However, DPT and DINO-UNet obtain strikingly good results, defining a new state of the art by outperforming the previous-best by 5.6 and 4.1 pp for instrument and 4.4 and 1.5 pp for uterus segmentation. Both methods obtain semantic and spatial precision, accurately segmenting intricate details.

CONCLUSION

Our results show the huge potential of using DINO and SAM for surgical computer vision, indicating a promising role for visual foundation models in medical image analysis, particularly in scenarios with limited or complex data.

