Towards Accurate Reconstruction of 3D Scene Shape From A Single Monocular Image.

Authors

Yin Wei, Zhang Jianming, Wang Oliver, Niklaus Simon, Chen Simon, Liu Yifan, Shen Chunhua

Publication

IEEE Trans Pattern Anal Mach Intell. 2023 May;45(5):6480-6494. doi: 10.1109/TPAMI.2022.3209968. Epub 2023 Apr 3.

DOI: 10.1109/TPAMI.2022.3209968
PMID: 36197868
Abstract

Despite significant progress made in the past few years, challenges remain for depth estimation using a single monocular image. First, it is nontrivial to train a metric-depth prediction model that can generalize well to diverse scenes mainly due to limited training data. Thus, researchers have built large-scale relative depth datasets that are much easier to collect. However, existing relative depth estimation models often fail to recover accurate 3D scene shapes due to the unknown depth shift caused by training with the relative depth data. We tackle this problem here and attempt to estimate accurate scene shapes by training on large-scale relative depth data, and estimating the depth shift. To do so, we propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then exploits 3D point cloud data to predict the depth shift and the camera's focal length that allow us to recover 3D scene shapes. As the two modules are trained separately, we do not need strictly paired training data. In addition, we propose an image-level normalized regression loss and a normal-based geometry loss to improve training with relative depth annotation. We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation. Code is available at: https://github.com/aim-uofa/depth/.
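The recovery step the abstract describes can be illustrated as a pinhole back-projection once the predicted depth shift and focal length are available: adding the shift to the affine-invariant depth fixes the shape distortion, and the focal length sets the unprojection geometry, leaving the scene correct up to a single global scale. The sketch below is a minimal NumPy illustration of that idea, not the authors' implementation; the function name, the image-centered principal point, and the input conventions are assumptions of this sketch.

```python
import numpy as np

def unproject_to_point_cloud(relative_depth, shift, focal_length):
    """Recover a 3D point cloud from an affine-invariant depth map.

    `relative_depth` is assumed predicted up to an unknown scale and shift;
    adding the predicted `shift` removes the shift distortion, and
    `focal_length` (in pixels) defines the pinhole back-projection.
    The result is correct up to one global scale factor.
    """
    h, w = relative_depth.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0   # assume principal point at image center
    depth = relative_depth + shift           # remove the unknown depth shift

    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / focal_length      # pinhole camera back-projection
    y = (v - cy) * depth / focal_length
    z = depth
    return np.stack([x, y, z], axis=-1)      # (H, W, 3) point cloud

# Example: a constant 4x4 depth map unprojects to a planar point cloud.
pts = unproject_to_point_cloud(np.ones((4, 4)), shift=0.5, focal_length=10.0)
print(pts.shape)  # (4, 4, 3)
```

A wrong shift bends the recovered surface (the effect the paper's second stage corrects), which is why the shift is estimated from the point cloud rather than left at zero.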

Similar Articles

1. Towards Accurate Reconstruction of 3D Scene Shape From A Single Monocular Image.
   IEEE Trans Pattern Anal Mach Intell. 2023 May;45(5):6480-6494. doi: 10.1109/TPAMI.2022.3209968. Epub 2023 Apr 3.
2. Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-Shot Metric Depth and Surface Normal Estimation.
   IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):10579-10596. doi: 10.1109/TPAMI.2024.3444912. Epub 2024 Nov 6.
3. Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction.
   IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7282-7295. doi: 10.1109/TPAMI.2021.3097396. Epub 2022 Sep 14.
4. SLAM-based dense surface reconstruction in monocular Minimally Invasive Surgery and its application to Augmented Reality.
   Comput Methods Programs Biomed. 2018 May;158:135-146. doi: 10.1016/j.cmpb.2018.02.006. Epub 2018 Feb 8.
5. Deep Learning-Based Monocular Depth Estimation Methods-A State-of-the-Art Review.
   Sensors (Basel). 2020 Apr 16;20(8):2272. doi: 10.3390/s20082272.
6. DPSNet: Multitask Learning Using Geometry Reasoning for Scene Depth and Semantics.
   IEEE Trans Neural Netw Learn Syst. 2023 Jun;34(6):2710-2721. doi: 10.1109/TNNLS.2021.3107362. Epub 2023 Jun 1.
7. Recovering dense 3D point clouds from single endoscopic image.
   Comput Methods Programs Biomed. 2021 Jun;205:106077. doi: 10.1016/j.cmpb.2021.106077. Epub 2021 Apr 3.
8. Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer.
   IEEE Trans Pattern Anal Mach Intell. 2022 Mar;44(3):1623-1637. doi: 10.1109/TPAMI.2020.3019967. Epub 2022 Feb 3.
9. NeuralRecon: Real-Time Coherent 3D Scene Reconstruction From Monocular Video.
   IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):7542-7555. doi: 10.1109/TPAMI.2024.3393141. Epub 2024 Nov 6.
10. Semi-Supervised Adversarial Monocular Depth Estimation.
   IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2410-2422. doi: 10.1109/TPAMI.2019.2936024. Epub 2019 Aug 20.

Cited By

1. UFM: Unified feature matching pre-training with multi-modal image assistants.
   PLoS One. 2025 Mar 31;20(3):e0319051. doi: 10.1371/journal.pone.0319051. eCollection 2025.