Electrical and Computer Engineering Department, Western Michigan University, Kalamazoo, MI 49008, USA.
Civil and Construction Engineering Department, Western Michigan University, Kalamazoo, MI 49008, USA.
Sensors (Basel). 2022 Dec 7;22(24):9578. doi: 10.3390/s22249578.
Pixel-level depth information is crucial to many applications, such as autonomous driving, robot navigation, 3D scene reconstruction, and augmented reality. However, depth information, which is usually acquired by sensors such as LiDAR, is sparse. Depth completion is the process of predicting the depth of missing pixels from a set of sparse depth measurements. Most ongoing research applies deep neural networks to the entire sparse depth map and camera scene without using any information about the objects in the scene, which results in more complex and resource-demanding networks. In this work, we propose to use image instance segmentation to detect objects of interest with pixel-level locations and to combine these detections with sparse depth data to support depth completion. The framework uses a two-branch encoder-decoder deep neural network. It fuses information about the objects in the scene, such as their type and pixel-level location, with LiDAR and RGB camera data to predict dense, accurate depth maps. Experimental results on the KITTI dataset show that the proposed method converges faster during training and surpasses the baseline model in all evaluation metrics.
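The abstract does not specify how the segmentation output, LiDAR depth, and RGB image are arranged before they enter the two branches. The following is a minimal sketch of one plausible arrangement, not the authors' implementation. The function name `build_branch_inputs`, the channel layout, and the choice to pair the class map with the RGB image are all assumptions made for illustration. It treats a zero in the sparse depth map as "no LiDAR return".

```python
import numpy as np


def build_branch_inputs(rgb, sparse_depth, instance_masks, class_ids, num_classes):
    """Assemble per-branch inputs (hypothetical layout, not taken from the paper).

    rgb:            (H, W, 3) float array
    sparse_depth:   (H, W) depth values, 0 where there is no LiDAR return
    instance_masks: (N, H, W) boolean masks, one per detected object
    class_ids:      length-N sequence of integer class labels in [0, num_classes)
    """
    h, w = sparse_depth.shape

    # Object type and pixel-level location: channel c is 1 wherever
    # an instance of class c covers the pixel.
    class_map = np.zeros((num_classes, h, w), dtype=np.float32)
    for mask, cid in zip(instance_masks, class_ids):
        class_map[int(cid)][mask] = 1.0

    # Validity mask marks the pixels that carry a depth measurement.
    valid = (sparse_depth > 0).astype(np.float32)

    # Assumed split: image branch gets RGB plus the class map,
    # depth branch gets sparse depth plus its validity mask.
    image_branch = np.concatenate(
        [rgb.transpose(2, 0, 1).astype(np.float32), class_map], axis=0
    )
    depth_branch = np.stack([sparse_depth.astype(np.float32), valid], axis=0)
    return image_branch, depth_branch
```

Under these assumptions, an image with `num_classes = 3` yields a 6-channel image-branch input and a 2-channel depth-branch input, each in channels-first order as most deep learning frameworks expect.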