
Self-supervised monocular depth and ego-motion estimation in endoscopy: Appearance flow to the rescue.

Affiliations

School of Automation Science and Electrical Engineering, Beihang University, Beijing, China.

School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; Hangzhou Innovation Institute, Beihang University, Hangzhou, China.

Publication Information

Med Image Anal. 2022 Apr;77:102338. doi: 10.1016/j.media.2021.102338. Epub 2021 Dec 25.

Abstract

Recently, self-supervised learning technology has been applied to calculate depth and ego-motion from monocular videos, achieving remarkable performance in autonomous driving scenarios. One widely adopted assumption of depth and ego-motion self-supervised learning is that the image brightness remains constant within nearby frames. Unfortunately, the endoscopic scene does not meet this assumption because there are severe brightness fluctuations induced by illumination variations, non-Lambertian reflections and interreflections during data collection, and these brightness fluctuations inevitably deteriorate the depth and ego-motion estimation accuracy. In this work, we introduce a novel concept referred to as appearance flow to address the brightness inconsistency problem. The appearance flow takes into consideration any variations in the brightness pattern and enables us to develop a generalized dynamic image constraint. Furthermore, we build a unified self-supervised framework to estimate monocular depth and ego-motion simultaneously in endoscopic scenes, which comprises a structure module, a motion module, an appearance module and a correspondence module, to accurately reconstruct the appearance and calibrate the image brightness. Extensive experiments are conducted on the SCARED dataset and EndoSLAM dataset, and the proposed unified framework exceeds other self-supervised approaches by a large margin. To validate our framework's generalization ability on different patients and cameras, we train our model on SCARED but test it on the SERV-CT and Hamlyn datasets without any fine-tuning, and the superior results reveal its strong generalization ability. Code is available at: https://github.com/ShuweiShao/AF-SfMLearner.
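To make the brightness-calibration idea concrete, the following is a minimal illustrative sketch of a photometric loss in which a predicted per-pixel appearance flow compensates for brightness changes between the target frame and the view-synthesized (warped) frame. The function name, the additive form of the calibration, and the L1 loss are assumptions chosen for illustration; the paper's exact formulation and network design are given in the linked repository.

```python
import numpy as np

def calibrated_photometric_loss(img_target, img_warped, appearance_flow):
    """Brightness-calibrated photometric L1 loss (illustrative sketch).

    img_target      : target frame, array of pixel intensities.
    img_warped      : source frame warped into the target view using the
                      predicted depth and ego-motion.
    appearance_flow : predicted per-pixel brightness change. Adding it to the
                      warped frame relaxes the brightness-constancy assumption
                      into a generalized dynamic image constraint of the form
                      img_target ~= img_warped + appearance_flow.
    """
    img_calibrated = img_warped + appearance_flow
    return float(np.mean(np.abs(img_target - img_calibrated)))

# Toy example: simulate an illumination shift between nearby frames.
rng = np.random.default_rng(0)
target = rng.random((8, 8))
warped = target - 0.2          # warped frame is uniformly darker
flow = 0.2 * np.ones((8, 8))   # appearance flow that explains the shift

naive_loss = float(np.mean(np.abs(target - warped)))          # penalized by brightness change
calibrated = calibrated_photometric_loss(target, warped, flow)  # brightness change explained away
```

With a perfect appearance-flow prediction the calibrated loss vanishes even under a global illumination shift, whereas the naive photometric loss stays large; this is the mechanism by which the appearance module keeps brightness fluctuations from corrupting the depth and ego-motion supervision signal.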

