Department of Smart Car Engineering, Chungbuk National University, 1 Chungdae-ro, Seowon-gu, Cheongju-si 28644, Republic of Korea.
Department of Intelligent Systems and Robotics, Chungbuk National University, 1 Chungdae-ro, Seowon-gu, Cheongju-si 28644, Republic of Korea.
Sensors (Basel). 2023 Jan 11;23(2):845. doi: 10.3390/s23020845.
Estimating accurate depth from 2D images is an important problem that has been studied for a long time. Recently, deep learning-based depth estimation from monocular camera images has advanced, and a variety of techniques are being explored to improve its accuracy. However, depth estimation from 2D images still has difficulty predicting the boundaries between objects. In this paper, we aim to predict fine-grained depth by emphasizing precise object boundaries. We propose a depth estimation network with an encoder-decoder structure that uses the Laplacian pyramid and the local planar guidance method. When the features learned by the encoder are upsampled, the Laplacian pyramid and local planar guidance techniques guide the network toward sharper object boundaries, yielding a clearer depth map. We train and test our models on the KITTI and NYU Depth V2 datasets. The proposed deep neural network (DNN) is built using only convolutions and uses ConvNeXt as its backbone. The trained model achieves an absolute relative error (Abs_rel) of 0.054 and a root mean square error (RMSE) of 2.252 on the KITTI dataset, and an Abs_rel of 0.102 and an RMSE of 0.355 on the NYU Depth V2 dataset. Among state-of-the-art monocular depth estimation methods, our network ranks fifth on the KITTI Eigen split and eighth on NYU Depth V2.
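The two error measures quoted in the abstract, Abs_rel and RMSE, have standard definitions, and a small worked example may help in reading the reported numbers. The sketch below is illustrative only: the function names `abs_rel` and `rmse`, the flat-list input, and the rule of skipping pixels whose ground-truth depth is not positive are assumptions made here, not details taken from the paper or its evaluation code.

```python
import math
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Keep (prediction, ground truth) pairs whose ground truth is positive.

    Assumption: non-positive ground truth marks a pixel with no measurement.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return pairs


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Absolute relative error: mean of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def rmse(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Root mean square error: sqrt of the mean of (pred - gt)^2 over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))


if __name__ == "__main__":
    # Toy depths (hypothetical values); the third pixel has no ground truth.
    gt_depth = [2.0, 4.0, 0.0, 10.0]
    pred_depth = [2.2, 3.6, 5.0, 10.0]
    print("Abs_rel:", abs_rel(pred_depth, gt_depth))
    print("RMSE:", rmse(pred_depth, gt_depth))
```

Because Abs_rel divides each error by the true depth, it is unitless and penalizes the same absolute error more at close range, whereas RMSE stays in the depth unit of the dataset and is dominated by large errors. This is one reason the two figures are reported together and should not be compared across datasets with different depth ranges.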