Suppr 超能文献


Deep Neural Networks for Accurate Depth Estimation with Latent Space Features.

Author Information

Muhammad Yasir Siddiqui, Hyunsik Ahn

Affiliations

Department of Mechanical System Engineering, Tongmyong University, Busan 48520, Republic of Korea.

School of Artificial Intelligence, Tongmyong University, Busan 48520, Republic of Korea.

Publication Information

Biomimetics (Basel). 2024 Dec 9;9(12):747. doi: 10.3390/biomimetics9120747.

DOI: 10.3390/biomimetics9120747
PMID: 39727751
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11673802/
Abstract

Depth estimation plays a pivotal role in advancing human-robot interactions, especially in indoor environments where accurate 3D scene reconstruction is essential for tasks like navigation and object handling. Monocular depth estimation, which relies on a single RGB camera, offers a more affordable solution compared to traditional methods that use stereo cameras or LiDAR. However, despite recent progress, many monocular approaches struggle with accurately defining depth boundaries, leading to less precise reconstructions. In response to these challenges, this study introduces a novel depth estimation framework that leverages latent space features within a deep convolutional neural network to enhance the precision of monocular depth maps. The proposed model features dual encoder-decoder architecture, enabling both color-to-depth and depth-to-depth transformations. This structure allows for refined depth estimation through latent space encoding. To further improve the accuracy of depth boundaries and local features, a new loss function is introduced. This function combines latent loss with gradient loss, helping the model maintain the integrity of depth boundaries. The framework is thoroughly tested using the NYU Depth V2 dataset, where it sets a new benchmark, particularly excelling in complex indoor scenarios. The results clearly show that this approach effectively reduces depth ambiguities and blurring, making it a promising solution for applications in human-robot interaction and 3D scene reconstruction.
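The new loss function described in the abstract, which combines a latent loss with a gradient loss to preserve depth boundaries, can be sketched roughly as follows. This is a minimal NumPy illustration and not the authors' implementation: the L1 base term, the L2 form of the latent loss, and the weights `w_latent` and `w_grad` are assumptions, since the abstract does not give the exact formulas.

```python
import numpy as np

def gradient_loss(pred, target):
    """Mean absolute difference between the spatial gradients of the
    predicted and ground-truth depth maps. Penalizing gradient mismatch
    discourages the blurred depth boundaries the paper targets."""
    dpx, dpy = np.diff(pred, axis=1), np.diff(pred, axis=0)
    dtx, dty = np.diff(target, axis=1), np.diff(target, axis=0)
    return np.abs(dpx - dtx).mean() + np.abs(dpy - dty).mean()

def latent_loss(z_rgb, z_depth):
    """L2 distance between the latent code from the color-to-depth branch
    and the one from the depth-to-depth branch of the dual
    encoder-decoder (an assumed form of the paper's latent loss)."""
    return float(np.mean((z_rgb - z_depth) ** 2))

def total_loss(pred, target, z_rgb, z_depth, w_latent=0.5, w_grad=1.0):
    """Pixel-wise L1 reconstruction term plus the two auxiliary terms.
    The base term and the weighting scheme are illustrative only."""
    l1 = np.abs(pred - target).mean()
    return l1 + w_latent * latent_loss(z_rgb, z_depth) \
              + w_grad * gradient_loss(pred, target)
```

In this reading, the depth-to-depth autoencoder supplies a "clean" latent target that the color-to-depth encoder is pulled toward, while the gradient term acts directly on depth-map edges.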

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2588/11673802/d3291c26eb0a/biomimetics-09-00747-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2588/11673802/176c7a254313/biomimetics-09-00747-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2588/11673802/61e6641d1701/biomimetics-09-00747-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2588/11673802/da98faad0920/biomimetics-09-00747-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2588/11673802/e8e165cecc6d/biomimetics-09-00747-g005.jpg

Similar Articles

1. Deep Neural Networks for Accurate Depth Estimation with Latent Space Features.
   Biomimetics (Basel). 2024 Dec 9;9(12):747. doi: 10.3390/biomimetics9120747.
2. Residual Vision Transformer and Adaptive Fusion Autoencoders for Monocular Depth Estimation.
   Sensors (Basel). 2024 Dec 26;25(1):80. doi: 10.3390/s25010080.
3. Deep Monocular Depth Estimation Based on Content and Contextual Features.
   Sensors (Basel). 2023 Mar 8;23(6):2919. doi: 10.3390/s23062919.
4. RT-ViT: Real-Time Monocular Depth Estimation Using Lightweight Vision Transformers.
   Sensors (Basel). 2022 May 19;22(10):3849. doi: 10.3390/s22103849.
5. Superb Monocular Depth Estimation Based on Transfer Learning and Surface Normal Guidance.
   Sensors (Basel). 2020 Aug 27;20(17):4856. doi: 10.3390/s20174856.
6. Deep Learning-Based Monocular Depth Estimation Methods-A State-of-the-Art Review.
   Sensors (Basel). 2020 Apr 16;20(8):2272. doi: 10.3390/s20082272.
7. Brain tumor segmentation and detection in MRI using convolutional neural networks and VGG16.
   Cancer Biomark. 2025 Mar;42(3):18758592241311184. doi: 10.1177/18758592241311184. Epub 2025 Apr 4.
8. Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers.
   Sensors (Basel). 2023 Jan 11;23(2):845. doi: 10.3390/s23020845.
9. Laplacian Pyramid Neural Network for Dense Continuous-Value Regression for Complex Scenes.
   IEEE Trans Neural Netw Learn Syst. 2021 Nov;32(11):5034-5046. doi: 10.1109/TNNLS.2020.3026669. Epub 2021 Oct 27.
10. Advanced Monocular Outdoor Pose Estimation in Autonomous Systems: Leveraging Optical Flow, Depth Estimation, and Semantic Segmentation with Dynamic Object Removal.
    Sensors (Basel). 2024 Dec 17;24(24):8040. doi: 10.3390/s24248040.

Cited By

1. BN-SNN: Spiking neural networks with bistable neurons for object detection.
   PLoS One. 2025 Jul 10;20(7):e0327513. doi: 10.1371/journal.pone.0327513. eCollection 2025.

References

1. An Interpretable Hand-Crafted Feature-Based Model for Atrial Fibrillation Detection.
   Front Physiol. 2021 May 13;12:657304. doi: 10.3389/fphys.2021.657304. eCollection 2021.
2. Sparse Representations for Object- and Ego-Motion Estimations in Dynamic Scenes.
   IEEE Trans Neural Netw Learn Syst. 2021 Jun;32(6):2521-2534. doi: 10.1109/TNNLS.2020.3006467. Epub 2021 Jun 2.
3. Automatic Depth Extraction from 2D Images Using a Cluster-Based Learning Framework.
   IEEE Trans Image Process. 2018 Jul;27(7):3288-3299. doi: 10.1109/TIP.2018.2813093.
4. Make3D: learning 3D scene structure from a single still image.
   IEEE Trans Pattern Anal Mach Intell. 2009 May;31(5):824-40. doi: 10.1109/TPAMI.2008.132.