
LapUNet: a novel approach to monocular depth estimation using dynamic Laplacian residual U-shape networks.

Authors

Xi Yanhui, Li Sai, Xu Zhikang, Zhou Feng, Tian Juanxiu

Affiliations

School of Electrical and Information Engineering, Changsha University of Science and Technology, Changsha, 410114, Hunan, China.

State Key Laboratory of Disaster Prevention & Reduction for Power Grid, Changsha University of Science & Technology, Changsha, 410114, Hunan, China.

Publication information

Sci Rep. 2024 Oct 9;14(1):23544. doi: 10.1038/s41598-024-74445-x.

Abstract

Monocular depth estimation is an important but challenging task. Although various encoder-decoder architectures have improved performance, the estimated depth maps still lack structural detail and clear edges because the decoders rely on simple repeated upsampling. To address this problem, this paper presents LapUNet (Laplacian U-shape network), in which the encoder adopts ResNeXt101 and the decoder is built from the novel DLRU (dynamic Laplacian residual U-shape) module. Based on a U-shape structure, the DLRU module supplements high-frequency features by fusing dynamic Laplacian residuals into the upsampling process; the residuals are dynamically learnable because a convolutional operation is added. In addition, an ASPP (atrous spatial pyramid pooling) module captures image context at multiple scales through multiple parallel atrous convolutional layers, and a depth map fusion module combines high- and low-frequency features from depth maps with different spatial resolutions. Experiments demonstrate that the proposed model, with a moderate model size, outperforms previous competing methods on the KITTI and NYU Depth V2 datasets. Furthermore, 3D reconstruction and target ranging using the estimated depth maps demonstrate the effectiveness of the proposed method.
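To make the idea of a Laplacian residual concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes 2x average pooling and nearest-neighbour upsampling as stand-ins for the network's resampling steps, and a hypothetical scalar `gain` in place of the learnable convolution that the abstract says makes the residual dynamic. All function names are illustrative.

```python
def downsample2x(img):
    """Average-pool a 2D list by a factor of 2 (height and width must be even)."""
    h, w = len(img), len(img[0])
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def upsample2x(img):
    """Nearest-neighbour upsampling of a 2D list by a factor of 2."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def laplacian_residual(img):
    """Return the low-resolution map and the high-frequency detail lost by resampling."""
    low = downsample2x(img)
    smooth = upsample2x(low)
    residual = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(img, smooth)]
    return low, residual


def reconstruct(low, residual, gain=1.0):
    """Upsample the low-resolution map and add back the (scaled) residual.

    `gain` is a hypothetical stand-in for a learned transformation of the residual.
    """
    up = upsample2x(low)
    return [[u + gain * r for u, r in zip(ru, rr)] for ru, rr in zip(up, residual)]


# A 4x4 toy "depth map" with a sharp vertical edge after the first column.
depth = [
    [1.0, 5.0, 5.0, 5.0],
    [1.0, 5.0, 5.0, 5.0],
    [1.0, 5.0, 5.0, 5.0],
    [1.0, 5.0, 5.0, 5.0],
]

low, residual = laplacian_residual(depth)
blurred = upsample2x(low)             # plain upsampling smears the edge
sharp = reconstruct(low, residual)    # adding the residual restores it
```

In this toy case plain upsampling replaces the edge columns with their average, while adding the residual back recovers the original map exactly. In a trained decoder the residual would be predicted from image features rather than computed from a ground-truth map.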


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/28364cfa3d6d/41598_2024_74445_Fig1_HTML.jpg
