Synthetic Data Enhancement and Network Compression Technology of Monocular Depth Estimation for Real-Time Autonomous Driving System.

Authors

Jun Woomin, Yoo Jisang, Lee Sungjin

Affiliations

Electronic Engineering, Dong Seoul University, Seongnam 13117, Republic of Korea.

Autonomous Driving Lab, Modulabs, Seoul 06252, Republic of Korea.

Publication information

Sensors (Basel). 2024 Jun 28;24(13):4205. doi: 10.3390/s24134205.

DOI:10.3390/s24134205
PMID:39000982
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11243791/
Abstract

Accurate 3D image recognition, critical for autonomous driving safety, is shifting from the LIDAR-based point cloud to camera-based depth estimation technologies driven by cost considerations and the point cloud's limitations in detecting distant small objects. This research aims to enhance MDE (Monocular Depth Estimation) using a single camera, offering extreme cost-effectiveness in acquiring 3D environmental data. In particular, this paper focuses on novel data augmentation methods designed to enhance the accuracy of MDE. Our research addresses the challenge of limited MDE data quantities by proposing the use of synthetic-based augmentation techniques: Mask, Mask-Scale, and CutFlip. The implementation of these synthetic-based data augmentation strategies has demonstrably enhanced the accuracy of MDE models by 4.0% compared to the original dataset. Furthermore, this study introduces the RMS (Real-time Monocular Depth Estimation configuration considering Resolution, Efficiency, and Latency) algorithm, designed for the optimization of neural networks to augment the performance of contemporary monocular depth estimation technologies through a three-step process. Initially, it selects a model based on minimum latency and REL criteria, followed by refining the model's accuracy using various data augmentation techniques and loss functions. Finally, the refined model is compressed using quantization and pruning techniques to minimize its size for efficient on-device real-time applications. Experimental results from implementing the RMS algorithm indicated that, within the required latency and size constraints, the IEBins model exhibited the most accurate REL (absolute RELative error) performance, achieving a 0.0480 REL. Furthermore, the data augmentation combination of the original dataset with Flip, Mask, and CutFlip, alongside the loss function, displayed the best REL performance, with a score of 0.0461. The network compression technique using FP16 was analyzed as the most effective, reducing the model size by 83.4% compared to the original while maintaining the least impact on REL performance and latency. Finally, the performance of the RMS algorithm was validated on the on-device autonomous driving platform, NVIDIA Jetson AGX Orin, through which optimal deployment strategies were derived for various applications and scenarios requiring autonomous driving technologies.
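
As a reading aid, the REL metric and the first step of the RMS procedure can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the standard definition of absolute relative error (the mean of |prediction - ground truth| / ground truth over pixels with valid ground truth). It also assumes that model selection means "lowest REL among candidates that satisfy the latency and size limits". The `Candidate` fields, the function names, and the threshold parameters are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


def abs_rel_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean of |pred - target| / target over pixels with positive ground truth."""
    ratios = [abs(p - t) / t for p, t in zip(pred, target) if t > 0]
    if not ratios:
        raise ValueError("no valid ground-truth depths")
    return sum(ratios) / len(ratios)


def relative_improvement(baseline: float, improved: float) -> float:
    """Fractional reduction of an error metric relative to its baseline."""
    return (baseline - improved) / baseline


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record of one benchmarked model (fields are assumptions)."""
    name: str
    rel: float
    latency_ms: float
    size_mb: float


def select_model(
    candidates: Sequence[Candidate],
    max_latency_ms: float,
    max_size_mb: float,
) -> Optional[Candidate]:
    """Return the lowest-REL candidate that meets both limits, or None."""
    feasible = [
        c for c in candidates
        if c.latency_ms <= max_latency_ms and c.size_mb <= max_size_mb
    ]
    return min(feasible, key=lambda c: c.rel, default=None)


if __name__ == "__main__":
    # Two pixels, each off by 5% of the true depth.
    print(f"{abs_rel_error([9.5, 21.0], [10.0, 20.0]):.4f}")
    # REL values quoted in the abstract: 0.0480 before, 0.0461 after augmentation.
    print(f"{relative_improvement(0.0480, 0.0461):.1%}")
```

With the two REL values quoted in the abstract, `relative_improvement(0.0480, 0.0461)` evaluates to roughly 0.0396, which is consistent with the reported 4.0% gain. The abstract itself does not state that the 4.0% figure was computed from exactly these two numbers.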

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/31e7/11243791/1185af6d1c27/sensors-24-04205-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/31e7/11243791/74d7288c3f85/sensors-24-04205-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/31e7/11243791/7e342dd381a3/sensors-24-04205-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/31e7/11243791/f77bf5e4ea07/sensors-24-04205-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/31e7/11243791/df14e9474268/sensors-24-04205-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/31e7/11243791/771cc666d82b/sensors-24-04205-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/31e7/11243791/38bb0b5e0de4/sensors-24-04205-g007.jpg
