Xu Jiayue, Zhao Jianping, Li Hua, Han Cheng, Xu Chao
School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China.
Sensors (Basel). 2023 Nov 16;23(22):9218. doi: 10.3390/s23229218.
Monocular panoramic depth estimation has various applications in robotics and autonomous driving due to its ability to perceive the entire field of view. However, panoramic depth estimation faces two significant challenges: capturing global context and handling panoramic distortion. In this paper, we propose a new framework for panoramic depth estimation that simultaneously addresses panoramic distortion and extracts global context information, thereby improving estimation performance. Specifically, we introduce an attention mechanism into multi-scale dilated convolution and adaptively adjust the receptive field size across spatial positions, yielding an adaptive attention dilated convolution module that effectively perceives distortion. At the same time, we design a global scene understanding module that integrates global context information into the feature maps produced by the feature extractor. Finally, we trained and evaluated our model on three benchmark datasets comprising both virtual and real-world RGB-D panoramas. The experimental results show that the proposed method achieves competitive performance, comparable to existing techniques in both quantitative and qualitative evaluations. Furthermore, our method has fewer parameters and greater flexibility, making it a scalable solution for mobile AR.
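The core idea of the adaptive attention dilated convolution can be illustrated with a minimal single-channel sketch: several dilated-convolution branches with different rates are fused by a per-pixel softmax attention, so the effective receptive field varies across spatial positions (useful where equirectangular distortion changes with latitude). The function names, the choice of softmax-over-rates fusion, and the use of the branch responses themselves as attention logits are our illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation):
    """Single-channel 2D dilated convolution with zero padding ("same" output size).

    Hypothetical helper for illustration; a real model would use a deep
    learning framework's dilated convolution instead.
    """
    kh, kw = kernel.shape
    pad_h, pad_w = dilation * (kh // 2), dilation * (kw // 2)
    xp = np.pad(x, ((pad_h, pad_h), (pad_w, pad_w)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap samples the input at a dilated offset.
            out += kernel[i, j] * xp[i * dilation: i * dilation + x.shape[0],
                                     j * dilation: j * dilation + x.shape[1]]
    return out

def adaptive_attention_dilated(x, kernel, rates=(1, 2, 3)):
    """Fuse multi-rate dilated responses with per-pixel softmax attention.

    The attention weights differ at every spatial position, so the
    effective receptive field adapts across the panorama.
    """
    branches = np.stack([dilated_conv2d(x, kernel, r) for r in rates])  # (R, H, W)
    # Softmax over the rate axis, computed per pixel (assumed fusion rule).
    logits = branches - branches.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * branches).sum(axis=0)  # (H, W)
```

On a constant input, every branch responds identically away from the border, so the fused output there equals the plain convolution response; the mechanism only differentiates the branches where local structure makes the multi-rate responses diverge.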