A Study on the Generality of Neural Network Structures for Monocular Depth Estimation.

Authors

Bae Jinwoo, Hwang Kyumin, Im Sunghoon

Publication information

IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2224-2238. doi: 10.1109/TPAMI.2023.3332407. Epub 2024 Mar 6.

DOI: 10.1109/TPAMI.2023.3332407
PMID: 37956006
Abstract

Monocular depth estimation has been widely studied, and significant performance improvements have recently been reported. However, most previous works are evaluated on only a few benchmark datasets, such as KITTI, and none provides an in-depth analysis of the generalization performance of monocular depth estimation. In this paper, we investigate in depth various backbone networks (e.g., CNN and Transformer models) with respect to the generalization of monocular depth estimation. First, we evaluate state-of-the-art models on both in-distribution datasets and out-of-distribution datasets, the latter never seen during network training. Then, we investigate the internal properties of the representations from the intermediate layers of CNN- and Transformer-based models using synthetic texture-shifted datasets. Through extensive experiments, we observe that Transformers exhibit a strong shape bias, whereas CNNs exhibit a strong texture bias. We also find that texture-biased models generalize worse on monocular depth estimation than shape-biased models. We demonstrate that similar trends are observed in real-world driving datasets captured under diverse environments. Lastly, we conduct a dense ablation study with various backbone networks that are used in modern strategies. The experiments demonstrate that the intrinsic locality of CNNs and the self-attention of Transformers induce texture bias and shape bias, respectively.
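To make the in-distribution versus out-of-distribution comparison concrete, the sketch below computes three error measures that are commonly reported for monocular depth estimation (absolute relative error, RMSE, and the fraction of pixels with ratio below 1.25). The abstract does not state which metrics the authors use, so the metric choice, the function names `depth_metrics` and `generalization_gap`, and the validity mask are all assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def depth_metrics(pred: Sequence[float], gt: Sequence[float]) -> Dict[str, float]:
    """Compute AbsRel, RMSE, and delta < 1.25 accuracy over valid pixels.

    Assumption: pixels with non-positive ground truth (or prediction)
    are treated as missing and skipped.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid pixels")
    n = len(pairs)
    # Mean absolute error relative to the true depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Root mean squared error in depth units.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Fraction of pixels whose prediction is within a factor of 1.25.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


def generalization_gap(in_dist: Dict[str, float], out_dist: Dict[str, float]) -> float:
    """Hypothetical summary score: increase in AbsRel on unseen data."""
    return out_dist["abs_rel"] - in_dist["abs_rel"]
```

Under these assumptions, a model would be scored once on held-out images from its training dataset and once on a dataset it never saw; a larger `generalization_gap` would indicate weaker generalization.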

Similar articles

1
A Study on the Generality of Neural Network Structures for Monocular Depth Estimation.
IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2224-2238. doi: 10.1109/TPAMI.2023.3332407. Epub 2024 Mar 6.
2
RT-ViT: Real-Time Monocular Depth Estimation Using Lightweight Vision Transformers.
Sensors (Basel). 2022 May 19;22(10):3849. doi: 10.3390/s22103849.
3
Lightweight monocular depth estimation using a fusion-improved transformer.
Sci Rep. 2024 Sep 28;14(1):22472. doi: 10.1038/s41598-024-72682-8.
4
SFA-MDEN: Semantic-Feature-Aided Monocular Depth Estimation Network Using Dual Branches.
Sensors (Basel). 2021 Aug 13;21(16):5476. doi: 10.3390/s21165476.
5
AMENet is a monocular depth estimation network designed for automatic stereoscopic display.
Sci Rep. 2024 Mar 11;14(1):5868. doi: 10.1038/s41598-024-56095-1.
6
Monocular Depth Estimation Using Multi-Scale Continuous CRFs as Sequential Deep Networks.
IEEE Trans Pattern Anal Mach Intell. 2019 Jun;41(6):1426-1440. doi: 10.1109/TPAMI.2018.2839602. Epub 2018 May 22.
7
Deep Learning-Based Monocular Depth Estimation Methods-A State-of-the-Art Review.
Sensors (Basel). 2020 Apr 16;20(8):2272. doi: 10.3390/s20082272.
8
Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers.
Sensors (Basel). 2023 Jan 11;23(2):845. doi: 10.3390/s23020845.
9
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer.
IEEE Trans Pattern Anal Mach Intell. 2022 Mar;44(3):1623-1637. doi: 10.1109/TPAMI.2020.3019967. Epub 2022 Feb 3.
10
Monocular Depth Estimation: Lightweight Convolutional and Matrix Capsule Feature-Fusion Network.
Sensors (Basel). 2022 Aug 23;22(17):6344. doi: 10.3390/s22176344.

Cited by

1
Human-like monocular depth biases in deep neural networks.
PLoS Comput Biol. 2025 Aug 19;21(8):e1013020. doi: 10.1371/journal.pcbi.1013020. eCollection 2025 Aug.
2
Monocular Depth Estimation via Self-Supervised Self-Distillation.
Sensors (Basel). 2024 Jun 24;24(13):4090. doi: 10.3390/s24134090.