
LapUNet: a novel approach to monocular depth estimation using dynamic Laplacian residual U-shape networks.

Authors

Xi Yanhui, Li Sai, Xu Zhikang, Zhou Feng, Tian Juanxiu

Affiliations

School of Electrical and Information Engineering, Changsha University of Science and Technology, Changsha, 410114, Hunan, China.

State Key Laboratory of Disaster Prevention & Reduction for Power Grid, Changsha University of Science & Technology, Changsha, 410114, Hunan, China.

Publication information

Sci Rep. 2024 Oct 9;14(1):23544. doi: 10.1038/s41598-024-74445-x.

DOI:10.1038/s41598-024-74445-x
PMID:39384933
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11464870/
Abstract

Monocular depth estimation is an important but challenging task. Although performance has been improved by adopting various encoder-decoder architectures, the estimated depth maps lack structural details and clear edges because of simple repeated upsampling. To solve this problem, this paper presents the novel LapUNet (Laplacian U-shape networks), in which the encoder adopts ResNeXt101 and the decoder is constructed with the novel DLRU (dynamic Laplacian residual U-shape) module. The DLRU module, based on the U-shape structure, can supplement high-frequency features by fusing a dynamic Laplacian residual into the upsampling process, and the residual is dynamically learnable owing to the addition of a convolutional operation. In addition, the ASPP (atrous spatial pyramid pooling) module is introduced to capture image context at multiple scales through multiple parallel atrous convolutional layers, and the depth map fusion module is used to combine high- and low-frequency features from depth maps with different spatial resolutions. Experiments demonstrate that the proposed model, with a moderate model size, is superior to previous competitors on the KITTI and NYU Depth V2 datasets. Furthermore, 3D reconstruction and target ranging using the estimated depth maps demonstrate the effectiveness of the proposed method.
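The idea of adding a Laplacian residual back during upsampling can be illustrated with a minimal NumPy sketch. This is not the authors' DLRU module: the abstract describes a residual made learnable by a convolution, whereas here a fixed scalar `weight` stands in for that learned operation. The function names, the 2x average-pool downsampling, and the nearest-neighbour upsampling are all assumptions chosen for illustration.

```python
import numpy as np


def downsample2x(x):
    """Average-pool a 2D array by a factor of 2 (height and width must be even)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample2x(x):
    """Nearest-neighbour upsampling of a 2D array by a factor of 2."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)


def laplacian_residual(x):
    """High-frequency detail lost by a downsample/upsample round trip."""
    return x - upsample2x(downsample2x(x))


def upsample_with_residual(coarse, guide, weight=1.0):
    """Upsample `coarse` and add back a scaled Laplacian residual of `guide`.

    `weight` is a hypothetical stand-in for the learnable convolution that
    the abstract says makes the residual dynamic.
    """
    return upsample2x(coarse) + weight * laplacian_residual(guide)


# A 4x4 map with a vertical edge that does not align with the 2x2 pooling grid.
fine = np.zeros((4, 4))
fine[:, 1:] = 1.0

coarse = downsample2x(fine)          # the edge is blurred at low resolution
blurry = upsample2x(coarse)          # plain upsampling cannot recover it
restored = upsample_with_residual(coarse, fine)

print(blurry[0])    # first row after plain upsampling
print(restored[0])  # first row after adding the residual
```

With `weight=1.0` and the guide equal to the original map, the residual exactly cancels the blur, which is the classical Laplacian pyramid reconstruction property. In a network the guide would be a feature map rather than the ground truth, so recovery is only approximate.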


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/28364cfa3d6d/41598_2024_74445_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/4983772ce5d7/41598_2024_74445_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/cf765f932410/41598_2024_74445_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/b4b78a9155c6/41598_2024_74445_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/e120d5558938/41598_2024_74445_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/20b5f56bc6d3/41598_2024_74445_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/d54d7655b657/41598_2024_74445_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/6e05a2d93977/41598_2024_74445_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/e4359d90c22b/41598_2024_74445_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad06/11464870/8a587e7296e9/41598_2024_74445_Fig10_HTML.jpg

Similar articles

1
LapUNet: a novel approach to monocular depth estimation using dynamic Laplacian residual U-shape networks.
Sci Rep. 2024 Oct 9;14(1):23544. doi: 10.1038/s41598-024-74445-x.
2
Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers.
Sensors (Basel). 2023 Jan 11;23(2):845. doi: 10.3390/s23020845.
3
Laplacian Pyramid Neural Network for Dense Continuous-Value Regression for Complex Scenes.
IEEE Trans Neural Netw Learn Syst. 2021 Nov;32(11):5034-5046. doi: 10.1109/TNNLS.2020.3026669. Epub 2021 Oct 27.
4
A Novel Method for Monocular Depth Estimation Using an Hourglass Neck Module.
Sensors (Basel). 2024 Feb 18;24(4):1312. doi: 10.3390/s24041312.
5
A multiple-channel and atrous convolution network for ultrasound image segmentation.
Med Phys. 2020 Dec;47(12):6270-6285. doi: 10.1002/mp.14512. Epub 2020 Oct 18.
6
DCPNet: A Densely Connected Pyramid Network for Monocular Depth Estimation.
Sensors (Basel). 2021 Oct 13;21(20):6780. doi: 10.3390/s21206780.
7
RT-ViT: Real-Time Monocular Depth Estimation Using Lightweight Vision Transformers.
Sensors (Basel). 2022 May 19;22(10):3849. doi: 10.3390/s22103849.
8
SAR-U-Net: Squeeze-and-excitation block and atrous spatial pyramid pooling based residual U-Net for automatic liver segmentation in Computed Tomography.
Comput Methods Programs Biomed. 2021 Sep;208:106268. doi: 10.1016/j.cmpb.2021.106268. Epub 2021 Jul 6.
9
High quality monocular depth estimation with parallel decoder.
Sci Rep. 2022 Oct 5;12(1):16616. doi: 10.1038/s41598-022-20909-x.
10
Index Networks.
IEEE Trans Pattern Anal Mach Intell. 2022 Jan;44(1):242-255. doi: 10.1109/TPAMI.2020.3004474. Epub 2021 Dec 7.
