Distance Transform Pooling Neural Network for LiDAR Depth Completion.

Authors

Zhao Yiming, Elhousni Mahdi, Zhang Ziming, Huang Xinming

Publication information

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5580-5589. doi: 10.1109/TNNLS.2021.3129801. Epub 2023 Sep 1.

Abstract

Recovering dense depth maps from sparse depth sensors, such as LiDAR, is a recently proposed task with many applications in computer vision and robotics. Previous works have identified input sparsity as the key challenge of this task. To address it, we propose a recurrent distance transform pooling (DTP) module that aggregates multi-level nearby information before the backbone neural network. The intuition behind this module comes from the observation that most pixels within the receptive field of the network are zero. Because most processed signals are uninformative zeros, a deep and heavy network structure is otherwise needed to enlarge the receptive field enough to capture useful information. Our recurrent DTP module fills empty pixels with the nearest value in a local patch and recurrently transforms distance to reach nearest points that lie farther away. The output of the DTP module is a collection of multi-level semi-dense depth maps, ranging from the original sparse map to an almost full one. Processing this collection relieves the network of the input sparsity, which helps a lightweight simplified ResNet-18 with 1M parameters achieve state-of-the-art performance on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) depth completion benchmark using LiDAR only. Besides being sparse, the input LiDAR map also contains some incorrect values caused by sensor error. We therefore further enhance the DTP with an error correction (EC) module to prevent incorrect input values from spreading. Finally, we discuss the benefit of using only LiDAR for nighttime driving and the potential extension of the proposed method to sensor fusion and indoor scenarios. The code has been released online at https://github.com/placeforyiming/DistanceTransform-DepthCompletion.
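To make the idea of filling empty pixels with the nearest value in a local patch more concrete, the following is a minimal NumPy sketch. It is not the authors' implementation (see the released repository for that). The function names `fill_nearest_in_patch` and `recurrent_dtp`, the `radius` and `steps` parameters, and the convention that a zero marks an empty pixel are all assumptions made for illustration, and the error correction (EC) module is not modelled.

```python
import numpy as np


def fill_nearest_in_patch(depth, radius=1):
    """Fill each empty (zero) pixel with the nearest valid value found
    inside its (2*radius+1) x (2*radius+1) patch. Hypothetical helper."""
    h, w = depth.shape
    # Visit neighbour offsets from closest to farthest, so the first
    # valid neighbour that reaches an empty pixel is its nearest one.
    offsets = [
        (dy, dx)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if (dy, dx) != (0, 0)
    ]
    offsets.sort(key=lambda o: o[0] ** 2 + o[1] ** 2)

    padded = np.pad(depth, radius, mode="constant", constant_values=0.0)
    out = depth.copy()
    for dy, dx in offsets:
        # shifted[y, x] equals depth[y + dy, x + dx] (zero outside the image).
        shifted = padded[radius + dy:radius + dy + h,
                         radius + dx:radius + dx + w]
        take = (out == 0) & (shifted > 0)
        out[take] = shifted[take]
    return out


def recurrent_dtp(sparse, steps=3, radius=1):
    """Apply the local fill repeatedly and keep every intermediate map,
    giving a list that runs from the sparse input to a denser map."""
    levels = [np.asarray(sparse, dtype=np.float32)]
    for _ in range(steps):
        levels.append(fill_nearest_in_patch(levels[-1], radius))
    return levels
```

Each pass only reads values from the previous level, so one pass extends the valid region by at most `radius` pixels; repeating the pass is what lets farther points be reached. In this sketch, the returned list would be stacked along the channel axis and fed to the backbone network.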

