

Deep Monocular Depth Estimation via Integration of Global and Local Predictions.

Author information

Kim Youngjung, Jung Hyungjoo, Min Dongbo, Sohn Kwanghoon

Publication information

IEEE Trans Image Process. 2018 May 15. doi: 10.1109/TIP.2018.2836318.

Abstract

Recent works on machine learning have greatly advanced the accuracy of single-image depth estimation. However, the resulting depth images are still over-smoothed and perceptually unsatisfying. This paper casts depth prediction from a single image as a parametric learning problem. Specifically, we propose a deep variational model that effectively integrates heterogeneous predictions from two convolutional neural networks (CNNs), named global and local networks. They have contrasting network architectures and are designed to capture depth information with complementary attributes. These intermediate outputs are then combined in the integration network based on the variational framework. By unrolling the optimization steps of Split Bregman (SB) iterations in the integration network, our model can be trained in an end-to-end manner. This enables one to simultaneously learn an efficient parameterization of the CNNs and the hyper-parameters of the variational method. Finally, we offer a new dataset of 0.22 million RGB-D images captured by Microsoft Kinect v2. Our model generates realistic and discontinuity-preserving depth predictions without involving any low-level segmentation or superpixels. Extensive experiments demonstrate the superiority of the proposed method on a range of RGB-D benchmarks including both indoor and outdoor scenarios.
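To illustrate the kind of optimization the integration network unrolls, below is a minimal 1-D sketch of Split Bregman iterations that fuse a "global" prediction (used as the data term) with a "local" prediction (used as gradient guidance). This is not the paper's integration network: the actual model operates on 2-D depth maps and learns its hyper-parameters end-to-end, whereas here the energy, the operator `D`, and the parameters `alpha`, `lam`, and `n_iter` are all illustrative assumptions.

```python
import numpy as np

def shrink(x, t):
    """Soft-thresholding: closed-form minimizer of the L1 subproblem."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fuse_split_bregman(g, l, alpha=1.0, lam=1.0, n_iter=30):
    """Fuse a 'global' prediction g with a 'local' prediction l by
    Split Bregman iterations on the toy energy
        min_u (alpha/2)||u - g||^2 + |D u - D l|_1,
    where D is the 1-D forward-difference operator, so the L1 term
    pulls the gradients of u toward those of the local prediction."""
    n = g.size
    D = np.eye(n, k=1)[:-1] - np.eye(n)[:-1]   # (n-1) x n forward differences
    A = alpha * np.eye(n) + lam * D.T @ D      # normal-equations matrix of the u-step
    Dl = D @ l
    d = np.zeros(n - 1)                        # auxiliary (split) variable
    b = np.zeros(n - 1)                        # Bregman variable
    u = g.copy()
    for _ in range(n_iter):
        # u-step: quadratic problem, solved exactly
        u = np.linalg.solve(A, alpha * g + lam * D.T @ (Dl + d - b))
        r = D @ u - Dl
        # d-step: elementwise soft-thresholding
        d = shrink(r + b, 1.0 / lam)
        # Bregman update
        b = b + r - d
    return u
```

Unrolling in the paper's sense means running a fixed number of such iterations as network layers, so that gradients flow through every `u`, `d`, and `b` update and the scalars playing the role of `alpha` and `lam` can be learned jointly with the CNNs.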

