

Monocular Depth Estimation via Self-Supervised Self-Distillation.

Authors

Hu Haifeng, Feng Yuyang, Li Dapeng, Zhang Suofei, Zhao Haitao

Affiliations

College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China.

College of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China.

Publication

Sensors (Basel). 2024 Jun 24;24(13):4090. doi: 10.3390/s24134090.

DOI: 10.3390/s24134090
PMID: 39000869
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11243901/
Abstract

Self-supervised monocular depth estimation can exhibit excellent performance in static environments due to the multi-view consistency assumption during the training process. However, it is hard to maintain depth consistency in dynamic scenes because of the occlusion caused by moving objects. For this reason, we propose a method of self-supervised self-distillation for monocular depth estimation (SS-MDE) in dynamic scenes, where a deep network with a multi-scale decoder and a lightweight pose network are designed to predict depth in a self-supervised manner via the disparity, motion information, and the association between two adjacent frames in the image sequence. Meanwhile, to improve the depth estimation accuracy of static areas, the pseudo-depth images generated by the LeReS network are used to provide pseudo-supervision information, enhancing the effect of depth refinement in static areas. Furthermore, a forgetting factor is leveraged to alleviate the dependency on the pseudo-supervision. In addition, a teacher model is introduced to generate depth prior information, and a multi-view mask filter module is designed to implement feature extraction and noise filtering. This enables the student model to better learn the deep structure of dynamic scenes, enhancing the generalization and robustness of the entire model in a self-distillation manner. Finally, on four public datasets, the proposed SS-MDE method outperformed several state-of-the-art monocular depth estimation techniques, achieving an accuracy (δ1) of 89% with an error (AbsRel) of 0.102 on NYU-Depth V2, and an accuracy (δ1) of 87% with an error (AbsRel) of 0.111 on KITTI.
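The training objective described above combines three signals: a self-supervised photometric term, pseudo-supervision from LeReS pseudo-depth decayed by a forgetting factor, and a distillation term against the teacher's mask-filtered depth priors. A minimal sketch of how such terms might be combined, assuming an exponential forgetting schedule and illustrative weight names (the paper's actual formulation may differ):

```python
def total_loss(photo_loss, pseudo_loss, distill_loss, epoch,
               gamma=0.9, w_pseudo=1.0, w_distill=1.0):
    """Illustrative combination of the three training terms.

    photo_loss:   self-supervised photometric reprojection loss
    pseudo_loss:  loss against LeReS pseudo-depth (static areas)
    distill_loss: loss against mask-filtered teacher depth priors
    gamma**epoch: 'forgetting factor' that decays pseudo-supervision
                  so reliance on it fades as training proceeds
    """
    forget = gamma ** epoch
    return photo_loss + forget * w_pseudo * pseudo_loss + w_distill * distill_loss
```

With this schedule, the pseudo-supervision term dominates early training and fades out, letting the self-supervised and distillation terms take over in later epochs.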

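The reported δ1 and AbsRel figures are the standard monocular-depth evaluation metrics: AbsRel is the mean relative absolute error, and δ1 is the fraction of pixels whose predicted-to-ground-truth depth ratio (in either direction) stays below 1.25. A minimal computation over flat lists of valid depths:

```python
def depth_metrics(pred, gt):
    """Standard metrics for monocular depth estimation.

    pred, gt: equal-length sequences of positive per-pixel depths.
    Returns (AbsRel, delta_1).
    """
    assert len(pred) == len(gt) > 0
    n = len(gt)
    # AbsRel: mean of |d_pred - d_gt| / d_gt
    abs_rel = sum(abs(p - g) / g for p, g in zip(pred, gt)) / n
    # delta_1: fraction of pixels with max(d_pred/d_gt, d_gt/d_pred) < 1.25
    delta1 = sum(max(p / g, g / p) < 1.25 for p, g in zip(pred, gt)) / n
    return abs_rel, delta1
```

A perfect prediction gives AbsRel = 0 and δ1 = 1; lower AbsRel and higher δ1 are better.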

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0028/11243901/f5f7e07296b5/sensors-24-04090-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0028/11243901/45a427641fc6/sensors-24-04090-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0028/11243901/4743e7aa73f2/sensors-24-04090-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0028/11243901/8311661f50fb/sensors-24-04090-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0028/11243901/1a43d60a5b1f/sensors-24-04090-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0028/11243901/531373ac79a1/sensors-24-04090-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0028/11243901/ffe42d0d6e10/sensors-24-04090-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0028/11243901/6f2c9786aa69/sensors-24-04090-g008.jpg

Similar Articles

1
Monocular Depth Estimation via Self-Supervised Self-Distillation.
Sensors (Basel). 2024 Jun 24;24(13):4090. doi: 10.3390/s24134090.
2
PMIndoor: Pose Rectified Network and Multiple Loss Functions for Self-Supervised Monocular Indoor Depth Estimation.
Sensors (Basel). 2023 Oct 30;23(21):8821. doi: 10.3390/s23218821.
3
SC-DepthV3: Robust Self-Supervised Monocular Depth Estimation for Dynamic Scenes.
IEEE Trans Pattern Anal Mach Intell. 2024 Jan;46(1):497-508. doi: 10.1109/TPAMI.2023.3322549. Epub 2023 Dec 5.
4
Self-Supervised Monocular Depth Estimation With Self-Perceptual Anomaly Handling.
IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):17292-17306. doi: 10.1109/TNNLS.2023.3301711. Epub 2024 Dec 2.
5
Monocular Depth Estimation from a Fisheye Camera Based on Knowledge Distillation.
Sensors (Basel). 2023 Dec 16;23(24):9866. doi: 10.3390/s23249866.
6
Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation.
Sci Rep. 2023 Nov 2;13(1):18939. doi: 10.1038/s41598-023-46178-w.
7
Self-supervised Monocular Depth Estimation with 3D Displacement Module for Laparoscopic Images.
IEEE Trans Med Robot Bionics. 2022 May;4(2):331-334. doi: 10.1109/TMRB.2022.3170206.
8
Deep Monocular Depth Estimation Based on Content and Contextual Features.
Sensors (Basel). 2023 Mar 8;23(6):2919. doi: 10.3390/s23062919.
9
Unsupervised Monocular Depth Estimation via Recursive Stereo Distillation.
IEEE Trans Image Process. 2021;30:4492-4504. doi: 10.1109/TIP.2021.3072215. Epub 2021 Apr 27.
10
SENSE: Self-Evolving Learning for Self-Supervised Monocular Depth Estimation.
IEEE Trans Image Process. 2024;33:439-450. doi: 10.1109/TIP.2023.3338053. Epub 2023 Dec 29.
