Self-Supervised monocular depth and ego-Motion estimation in endoscopy: Appearance flow to the rescue.

Affiliations

School of Automation Science and Electrical Engineering, Beihang University, Beijing, China.

School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; Hangzhou Innovation Institute, Beihang University, Hangzhou, China.

Publication information

Med Image Anal. 2022 Apr;77:102338. doi: 10.1016/j.media.2021.102338. Epub 2021 Dec 25.

DOI:10.1016/j.media.2021.102338

PMID:35016079

Abstract

Recently, self-supervised learning technology has been applied to calculate depth and ego-motion from monocular videos, achieving remarkable performance in autonomous driving scenarios. One widely adopted assumption of depth and ego-motion self-supervised learning is that the image brightness remains constant within nearby frames. Unfortunately, the endoscopic scene does not meet this assumption because there are severe brightness fluctuations induced by illumination variations, non-Lambertian reflections and interreflections during data collection, and these brightness fluctuations inevitably deteriorate the depth and ego-motion estimation accuracy. In this work, we introduce a novel concept referred to as appearance flow to address the brightness inconsistency problem. The appearance flow takes into consideration any variations in the brightness pattern and enables us to develop a generalized dynamic image constraint. Furthermore, we build a unified self-supervised framework to estimate monocular depth and ego-motion simultaneously in endoscopic scenes, which comprises a structure module, a motion module, an appearance module and a correspondence module, to accurately reconstruct the appearance and calibrate the image brightness. Extensive experiments are conducted on the SCARED dataset and EndoSLAM dataset, and the proposed unified framework exceeds other self-supervised approaches by a large margin. To validate our framework's generalization ability on different patients and cameras, we train our model on SCARED but test it on the SERV-CT and Hamlyn datasets without any fine-tuning, and the superior results reveal its strong generalization ability. Code is available at: https://github.com/ShuweiShao/AF-SfMLearner.
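
The brightness-constancy problem described in the abstract can be illustrated with a small numerical sketch. The abstract does not give the paper's actual formulation, so the additive per-pixel offset used here as a stand-in for the appearance flow, the function name `photometric_residual`, and the synthetic frames are all assumptions made for illustration only.

```python
import numpy as np


def photometric_residual(target, warped_source, appearance_flow=None):
    """Mean absolute brightness difference between a target frame and a warped source frame.

    `appearance_flow` is a hypothetical per-pixel additive brightness offset.
    Passing None corresponds to the plain brightness-constancy assumption.
    """
    if appearance_flow is None:
        calibrated = warped_source
    else:
        calibrated = warped_source + appearance_flow
    return float(np.mean(np.abs(target - calibrated)))


rng = np.random.default_rng(0)

# Synthetic source frame, assumed to be already warped into the target view.
warped_source = rng.uniform(0.2, 0.6, size=(4, 4))

# Simulated illumination change: brightness increases from left to right.
brightness_change = np.linspace(0.0, 0.3, 4)[None, :] * np.ones((4, 1))
target = warped_source + brightness_change

# Brightness constancy: the illumination change shows up as photometric error.
residual_constancy = photometric_residual(target, warped_source)

# Generalized constraint: an offset that matches the change removes the error.
residual_calibrated = photometric_residual(target, warped_source, brightness_change)

print(residual_constancy, residual_calibrated)
```

In this toy setting the geometry is perfect, yet the plain photometric residual is nonzero purely because of the brightness change. A self-supervised loss built on that residual would wrongly penalize correct depth and pose, which is the failure mode the abstract attributes to endoscopic scenes. In a real framework the offset would be predicted by a network rather than known in advance.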

Similar articles

Self-supervised monocular depth estimation for gastrointestinal endoscopy.

Comput Methods Programs Biomed. 2023 Aug;238:107619. doi: 10.1016/j.cmpb.2023.107619. Epub 2023 May 19.

Self-supervised neural network-based endoscopic monocular 3D reconstruction method.

Health Inf Sci Syst. 2023 Dec 11;12(1):4. doi: 10.1007/s13755-023-00262-7. eCollection 2024 Dec.

EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos.

Med Image Anal. 2021 Jul;71:102058. doi: 10.1016/j.media.2021.102058. Epub 2021 Apr 15.

Self-supervised Monocular Depth Estimation with 3D Displacement Module for Laparoscopic Images.

IEEE Trans Med Robot Bionics. 2022 May;4(2):331-334. doi: 10.1109/TMRB.2022.3170206.

Self-Supervised Monocular Depth Estimation for Endoscopic Imaging.

IEEE J Biomed Health Inform. 2024 Jul 29;PP. doi: 10.1109/JBHI.2024.3434372.

Adversarial Learning for Joint Optimization of Depth and Ego-Motion.

IEEE Trans Image Process. 2020 Jan 28. doi: 10.1109/TIP.2020.2968751.

Image Intrinsic-Based Unsupervised Monocular Depth Estimation in Endoscopy.

IEEE J Biomed Health Inform. 2024 May 14;PP. doi: 10.1109/JBHI.2024.3400804.

Unsupervised Learning of Monocular Depth and Ego-Motion with Optical Flow Features and Multiple Constraints.

Sensors (Basel). 2022 Feb 11;22(4):1383. doi: 10.3390/s22041383.

Monocular Depth Estimation via Self-Supervised Self-Distillation.

Sensors (Basel). 2024 Jun 24;24(13):4090. doi: 10.3390/s24134090.

Cited by

SHADeS: self-supervised monocular depth estimation through non-Lambertian image decomposition.

Int J Comput Assist Radiol Surg. 2025 Jun;20(6):1255-1263. doi: 10.1007/s11548-025-03371-8. Epub 2025 May 13.

WS-SfMLearner: self-supervised monocular depth and ego-motion estimation on surgical videos with unknown camera parameters.

J Med Imaging (Bellingham). 2025 Mar;12(2):025003. doi: 10.1117/1.JMI.12.2.025003. Epub 2025 Apr 30.

Enhanced self-supervised monocular depth estimation with self-attention and joint depth-pose loss for laparoscopic images.

Int J Comput Assist Radiol Surg. 2025 Apr;20(4):775-785. doi: 10.1007/s11548-025-03332-1. Epub 2025 Feb 28.

SfMDiffusion: self-supervised monocular depth estimation in endoscopy based on diffusion models.

Int J Comput Assist Radiol Surg. 2025 May;20(5):971-979. doi: 10.1007/s11548-025-03333-0. Epub 2025 Feb 24.

Neural fields for 3D tracking of anatomy and surgical instruments in monocular laparoscopic video clips.

Healthc Technol Lett. 2024 Dec 12;11(6):411-417. doi: 10.1049/htl2.12113. eCollection 2024 Dec.

Deep learning-assisted 3D laser steering using an optofluidic laser scanner.

Biomed Opt Express. 2024 Feb 15;15(3):1668-1681. doi: 10.1364/BOE.514489. eCollection 2024 Mar 1.

Improving image classification of gastrointestinal endoscopy using curriculum self-supervised learning.

Sci Rep. 2024 Mar 13;14(1):6100. doi: 10.1038/s41598-024-53955-8.

Surgical-DINO: adapter learning of foundation models for depth estimation in endoscopic surgery.

Int J Comput Assist Radiol Surg. 2024 Jun;19(6):1013-1020. doi: 10.1007/s11548-024-03083-5. Epub 2024 Mar 8.

Pose estimation via structure-depth information from monocular endoscopy images sequence.

Biomed Opt Express. 2023 Dec 22;15(1):460-478. doi: 10.1364/BOE.498262. eCollection 2024 Jan 1.

Self-supervised neural network-based endoscopic monocular 3D reconstruction method.

Health Inf Sci Syst. 2023 Dec 11;12(1):4. doi: 10.1007/s13755-023-00262-7. eCollection 2024 Dec.
