Li Shuguang, Yan Jiafu, Chen Haoran, Zheng Ke
School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
Sensors (Basel). 2023 Aug 31;23(17):7560. doi: 10.3390/s23177560.
Depth estimation is an important part of the perception system in autonomous driving. Current studies often reconstruct dense depth maps from RGB images together with sparse depth maps obtained from other sensors. However, existing methods often pay insufficient attention to latent semantic information. Considering the highly structured character of driving scenes, we propose a dual-branch network that predicts dense depth maps by fusing radar and RGB images. In the proposed architecture, the driving scene is divided into three parts, each of which predicts a depth map; the three maps are then merged into one by a fusion strategy, making full use of the latent semantic information in the driving scene. In addition, a variant of the L1 loss function is applied in the training phase, directing the network to focus more on regions of interest when driving. The proposed method is evaluated on the nuScenes dataset, and experiments demonstrate its effectiveness in comparison with previous state-of-the-art methods.
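The abstract does not specify how the three per-branch depth maps are merged. As one plausible reading, the sketch below fuses them with a softmax-normalized, per-pixel confidence weighting; the confidence logits and the weighting scheme are assumptions for illustration, not the paper's stated fusion strategy.

```python
import torch

def fuse_branch_depths(depths: torch.Tensor, confidences: torch.Tensor) -> torch.Tensor:
    """Merge per-branch depth predictions into one dense depth map.

    depths:      (B, 3, H, W) depth maps, one channel per scene branch
    confidences: (B, 3, H, W) unnormalized per-branch confidence logits
                 (assumed here; the paper's actual fusion is unspecified)
    """
    # Softmax across the branch dimension gives per-pixel weights summing to 1.
    weights = torch.softmax(confidences, dim=1)
    return (weights * depths).sum(dim=1, keepdim=True)  # (B, 1, H, W)
```

Likewise, the exact form of the variant L1 loss is not given in the abstract. A minimal sketch of one possible reading, an L1 term with larger weights on driving-relevant pixels, follows; `roi_weight`, `roi_mask`, and `valid_mask` are hypothetical names introduced for this example.

```python
import torch
import torch.nn as nn

class RegionWeightedL1Loss(nn.Module):
    """Hypothetical region-weighted variant of the L1 loss: pixels inside
    driving-relevant regions receive a larger weight than the background."""

    def __init__(self, roi_weight: float = 2.0):
        super().__init__()
        self.roi_weight = roi_weight  # assumed up-weighting factor

    def forward(self, pred, target, valid_mask, roi_mask):
        # valid_mask: float mask of pixels with ground-truth depth (sparse supervision)
        # roi_mask:   float mask of regions of interest when driving (assumed input)
        weights = torch.where(roi_mask.bool(),
                              torch.full_like(target, self.roi_weight),
                              torch.ones_like(target))
        abs_err = (pred - target).abs() * weights * valid_mask
        # Normalize by the number of supervised pixels to keep the scale stable.
        return abs_err.sum() / valid_mask.sum().clamp(min=1)
```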