

Surgical-DINO: adapter learning of foundation models for depth estimation in endoscopic surgery.

Affiliations

The Chinese University of Hong Kong, Hong Kong, China.

Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, UK.

Publication

Int J Comput Assist Radiol Surg. 2024 Jun;19(6):1013-1020. doi: 10.1007/s11548-024-03083-5. Epub 2024 Mar 8.

DOI: 10.1007/s11548-024-03083-5
PMID: 38459402
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11178563/
Abstract

PURPOSE

Depth estimation in robotic surgery is vital for 3D reconstruction, surgical navigation and augmented reality visualization. Although foundation models exhibit outstanding performance in many vision tasks, including depth estimation (e.g., DINOv2), recent works have observed their limitations in medical and surgical domain-specific applications. This work presents a low-rank adaptation (LoRA) of a foundation model for surgical depth estimation.

METHODS

We design a foundation model-based depth estimation method, referred to as Surgical-DINO, a low-rank adaptation of DINOv2 for depth estimation in endoscopic surgery. Instead of conventional fine-tuning, we build LoRA layers and integrate them into DINO to adapt it to surgery-specific domain knowledge. During training, we freeze the DINO image encoder, which shows excellent visual representation capacity, and optimize only the LoRA layers and the depth decoder to integrate features from the surgical scene.
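The adaptation scheme described above (freeze the pretrained encoder, train only a low-rank residual plus the decoder) can be sketched in PyTorch. This is a minimal illustration, not the paper's exact implementation: the rank `r`, scaling `alpha`, and the choice of wrapping a single projection layer are assumptions for demonstration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, with A (r x in) and B (out x r)."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze pretrained weights
        # A is small random, B is zero, so training starts from the base output
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

# Hypothetical usage: wrap one 768-dim projection of a ViT-style encoder
layer = LoRALinear(nn.Linear(768, 768), r=4)
x = torch.randn(2, 196, 768)          # batch of 196 patch tokens
out = layer(x)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
```

Because `lora_b` is initialized to zero, the wrapped layer reproduces the frozen encoder's output exactly at the start of training, and only the 2·r·d low-rank parameters (plus the depth decoder, not shown) receive gradients.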

RESULTS

Our model is extensively validated on the MICCAI SCARED challenge dataset, collected from da Vinci Xi endoscope surgery. We empirically show that Surgical-DINO significantly outperforms all state-of-the-art models in endoscopic depth estimation tasks. Ablation studies confirm the substantial effect of our LoRA layers and adaptation.

CONCLUSION

Surgical-DINO sheds light on the successful adaptation of foundation models to the surgical domain for depth estimation. The results show clearly that zero-shot prediction with weights pre-trained on computer vision datasets, or naive fine-tuning, is not sufficient to apply a foundation model directly in the surgical domain.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d658/11178563/ae70979913aa/11548_2024_3083_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d658/11178563/2bc8075d014b/11548_2024_3083_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d658/11178563/4289e0956755/11548_2024_3083_Fig3_HTML.jpg

Similar Articles

1. Surgical-DINO: adapter learning of foundation models for depth estimation in endoscopic surgery.
Int J Comput Assist Radiol Surg. 2024 Jun;19(6):1013-1020. doi: 10.1007/s11548-024-03083-5. Epub 2024 Mar 8.
2. Can surgical computer vision benefit from large-scale visual foundation models?
Int J Comput Assist Radiol Surg. 2024 Jun;19(6):1157-1163. doi: 10.1007/s11548-024-03125-y. Epub 2024 Apr 12.
3. FRSR: Framework for real-time scene reconstruction in robot-assisted minimally invasive surgery.
Comput Biol Med. 2023 Sep;163:107121. doi: 10.1016/j.compbiomed.2023.107121. Epub 2023 Jun 3.
4. Foundation models in gastrointestinal endoscopic AI: Impact of architecture, pre-training approach and data efficiency.
Med Image Anal. 2024 Dec;98:103298. doi: 10.1016/j.media.2024.103298. Epub 2024 Aug 12.
5. OneSLAM to map them all: a generalized approach to SLAM for monocular endoscopic imaging based on tracking any point.
Int J Comput Assist Radiol Surg. 2024 Jul;19(7):1259-1266. doi: 10.1007/s11548-024-03171-6. Epub 2024 May 22.
6. Stereo Dense Scene Reconstruction and Accurate Localization for Learning-Based Navigation of Laparoscope in Minimally Invasive Surgery.
IEEE Trans Biomed Eng. 2023 Feb;70(2):488-500. doi: 10.1109/TBME.2022.3195027. Epub 2023 Jan 19.
7. StaSiS-Net: A stacked and siamese disparity estimation network for depth reconstruction in modern 3D laparoscopy.
Med Image Anal. 2022 Apr;77:102380. doi: 10.1016/j.media.2022.102380. Epub 2022 Jan 30.
8. MonoLoT: Self-Supervised Monocular Depth Estimation in Low-Texture Scenes for Automatic Robotic Endoscopy.
IEEE J Biomed Health Inform. 2024 Oct;28(10):6078-6091. doi: 10.1109/JBHI.2024.3423791. Epub 2024 Oct 3.
9. Spatio-temporal layers based intra-operative stereo depth estimation network via hierarchical prediction and progressive training.
Comput Methods Programs Biomed. 2024 Feb;244:107937. doi: 10.1016/j.cmpb.2023.107937. Epub 2023 Nov 22.
10. A controlled laboratory and clinical evaluation of a three-dimensional endoscope for endonasal sinus and skull base surgery.
Am J Rhinol Allergy. 2011 May-Jun;25(3):141-4. doi: 10.2500/ajra.2011.25.3593.

Cited By

1. Postoperative outcome analysis of chronic rhinosinusitis using transfer learning with pre-trained foundation models based on endoscopic images: a multicenter, observational study.
Biomed Eng Online. 2025 Jul 27;24(1):95. doi: 10.1186/s12938-025-01428-y.
2. Enhance fashion classification of mosquito vector species via self-supervised vision transformer.
Sci Rep. 2024 Dec 28;14(1):31517. doi: 10.1038/s41598-024-83358-8.
3. Neural fields for 3D tracking of anatomy and surgical instruments in monocular laparoscopic video clips.
Healthc Technol Lett. 2024 Dec 12;11(6):411-417. doi: 10.1049/htl2.12113. eCollection 2024 Dec.
4. Foundational Models for Pathology and Endoscopy Images: Application for Gastric Inflammation.
Diagnostics (Basel). 2024 Aug 30;14(17):1912. doi: 10.3390/diagnostics14171912.

References

1. Unsupervised Convolutional Neural Network for Motion Estimation in Ultrasound Elastography.
IEEE Trans Ultrason Ferroelectr Freq Control. 2022 Jul;69(7):2236-2247. doi: 10.1109/TUFFC.2022.3171676. Epub 2022 Jun 30.
2. Self-Supervised monocular depth and ego-motion estimation in endoscopy: Appearance flow to the rescue.
Med Image Anal. 2022 Apr;77:102338. doi: 10.1016/j.media.2021.102338. Epub 2021 Dec 25.
3. EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos.
Med Image Anal. 2021 Jul;71:102058. doi: 10.1016/j.media.2021.102058. Epub 2021 Apr 15.
4. Dense Depth Estimation in Monocular Endoscopy With Self-Supervised Learning Methods.
IEEE Trans Med Imaging. 2020 May;39(5):1438-1447. doi: 10.1109/TMI.2019.2950936. Epub 2019 Nov 1.