Wang Shuling, Jiang Fengze, Gong Xiaojin
The College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China.
Sensors (Basel). 2024 Sep 27;24(19):6270. doi: 10.3390/s24196270.
Depth information is crucial for perceiving three-dimensional scenes. However, depth maps captured directly by depth sensors are often incomplete and noisy. The objective of the depth-completion task is to generate dense and accurate depth maps from sparse depth inputs by fusing guidance information from the corresponding color images captured by camera sensors. To address these challenges, we introduce transformer models, which have shown great promise in computer vision, into image-guided depth completion. Leveraging the self-attention mechanism, we propose a novel network architecture that meets the requirements of high accuracy and high resolution in depth data. Specifically, we design a dual-branch model with a transformer-based encoder that serializes image features into tokens step by step and extracts multi-scale pyramid features suitable for pixel-wise dense prediction. We also incorporate a dual-attention fusion module to strengthen the fusion between the two branches. This module combines convolution-based spatial- and channel-attention mechanisms, which are adept at capturing local information, with cross-attention mechanisms, which excel at capturing long-range relationships. Our model achieves state-of-the-art performance on both the NYUv2 and SUN-RGBD depth datasets, and our ablation studies confirm the effectiveness of the designed modules.
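The abstract does not give the internals of the dual-attention fusion module, so the following is only a minimal NumPy sketch of the general idea it describes: a local path (channel attention followed by convolution-based spatial attention) combined with a global path (cross-attention from depth-branch tokens to color-branch tokens). All function names, the weight shapes, the squeeze-and-excitation style channel gate, and the residual sum used to merge the two paths are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Gate each channel of feat (C, H, W) using a global-average-pooled descriptor."""
    pooled = feat.mean(axis=(1, 2))                     # (C,)
    gate = sigmoid(w2 @ np.maximum(w1 @ pooled, 0.0))   # two-layer MLP, gate in (0, 1)
    return feat * gate[:, None, None]

def spatial_attention(feat, kernel):
    """Gate each pixel using a k x k convolution over channel-wise mean and max maps."""
    stats = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(stats, ((0, 0), (pad, pad), (pad, pad)))
    h, w = feat.shape[1:]
    logits = np.zeros((h, w))
    for i in range(k):          # explicit sliding-window sum, 'same' zero padding
        for j in range(k):
            window = padded[:, i:i + h, j:j + w]
            logits += (kernel[:, i, j, None, None] * window).sum(axis=0)
    return feat * sigmoid(logits)[None]

def cross_attention(query_feat, context_feat, wq, wk, wv):
    """Every query-branch pixel attends to all context-branch pixels."""
    c, h, w = query_feat.shape
    q_tok = query_feat.reshape(c, h * w).T      # (N, C) tokens, N = H * W
    ctx_tok = context_feat.reshape(c, h * w).T  # (N, C)
    q, k, v = q_tok @ wq, ctx_tok @ wk, ctx_tok @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (N, N), rows sum to 1
    return (attn @ v).T.reshape(c, h, w)

def dual_attention_fusion(depth_feat, color_feat, p):
    """Assumed merge rule: residual sum of the local and global paths."""
    mixed = depth_feat + color_feat
    local = spatial_attention(channel_attention(mixed, p["w1"], p["w2"]), p["kernel"])
    global_ = cross_attention(depth_feat, color_feat, p["wq"], p["wk"], p["wv"])
    return depth_feat + local + global_

def make_params(c, d=4, k=3, seed=0):
    """Random weights for the sketch; a trained model would learn these."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(size=(c // 2, c)), "w2": rng.normal(size=(c, c // 2)),
        "kernel": rng.normal(size=(2, k, k)),
        "wq": rng.normal(size=(c, d)), "wk": rng.normal(size=(c, d)),
        "wv": rng.normal(size=(c, c)),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    depth = rng.normal(size=(8, 6, 5))  # depth-branch features (C, H, W)
    color = rng.normal(size=(8, 6, 5))  # color-branch features (C, H, W)
    fused = dual_attention_fusion(depth, color, make_params(8))
    print(fused.shape)
```

In this sketch the convolutional gates only look at a small neighborhood or a pooled summary, while the cross-attention matrix relates every depth token to every color token, which mirrors the local versus long-range split described above. Applying it at each level of a multi-scale feature pyramid would be one plausible use, though the abstract does not say where the module is placed.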