

Computational Optimization of Image-Based Reinforcement Learning for Robotics.

Authors

Ferraro Stefano, Van de Maele Toon, Mazzaglia Pietro, Verbelen Tim, Dhoedt Bart

Affiliation

IDLab, Ghent University, 25843 Ghent, Belgium.

Publication

Sensors (Basel). 2022 Sep 28;22(19):7382. doi: 10.3390/s22197382.

DOI: 10.3390/s22197382
PMID: 36236477
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9571553/
Abstract

The robotics field has been deeply influenced by the advent of deep learning. In recent years, this trend has been characterized by the adoption of large, pretrained models for robotic use cases, which are not compatible with the computational hardware available on robotic systems. Moreover, such large, computationally intensive models impede the low-latency execution required by many closed-loop control systems. In this work, we propose different strategies for improving the computational efficiency of the deep-learning models adopted in reinforcement-learning (RL) scenarios. As a use case, we consider an image-based RL method built on the synergy between pushing and grasping actions. As a first optimization step, we reduce the complexity of the model architecture by decreasing the number of layers and altering its structure. Second, we downscale the input resolution to reduce the computational load. Finally, we perform weight quantization, comparing post-training quantization and quantization-aware training. We benchmark the improvement introduced by each optimization by running a standard testing routine. We show that the optimization strategies introduced can improve computational efficiency by around 300 times, while also slightly improving the functional performance of the system. In addition, we demonstrate closed-loop control behaviour on a real-world robot, processing everything on a Jetson Xavier NX edge device.
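The abstract contrasts post-training quantization with quantization-aware training as the final optimization step. As a minimal illustration of the post-training side only, the sketch below applies uniform affine int8 quantization to a weight matrix. The helper names (`quantize_int8`, `dequantize`) and the scale/zero-point scheme are a common convention assumed here, not the paper's actual implementation:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Uniform affine post-training quantization of float weights to int8."""
    w_min, w_max = float(w.min()), float(w.max())
    # Map the observed float range onto the 256 representable int8 levels.
    scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
    zero_point = np.round(-w_min / scale) - 128  # integer offset so w_min -> -128
    q = np.clip(np.round(w / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: float) -> np.ndarray:
    """Recover approximate float weights to measure the quantization error."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)  # stand-in for a layer's weights
q, s, z = quantize_int8(w)
w_hat = dequantize(q, s, z)
max_err = float(np.abs(w - w_hat).max())  # bounded by roughly one quantization step
```

The int8 tensor needs a quarter of the float32 storage, which is the kind of saving that matters on an edge device such as the Jetson Xavier NX mentioned in the abstract; quantization-aware training differs in that the rounding is simulated during training so the network can compensate for it.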


Figures (g001–g009):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/35c5de70439b/sensors-22-07382-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/37a1a2fe7d47/sensors-22-07382-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/226e79d6e8b8/sensors-22-07382-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/35ea90e4d9f2/sensors-22-07382-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/3b166349a4c1/sensors-22-07382-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/816f1ab63e6c/sensors-22-07382-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/4006f19793ea/sensors-22-07382-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/43105c8df799/sensors-22-07382-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a30/9571553/e7ea49bc29b8/sensors-22-07382-g009.jpg

Similar Articles

1. Computational Optimization of Image-Based Reinforcement Learning for Robotics.
   Sensors (Basel). 2022 Sep 28;22(19):7382. doi: 10.3390/s22197382.
2. Bio-inspired grasp control in a robotic hand with massive sensorial input.
   Biol Cybern. 2009 Feb;100(2):109-28. doi: 10.1007/s00422-008-0279-0. Epub 2008 Dec 9.
3. Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference.
   Front Artif Intell. 2021 Jul 9;4:676564. doi: 10.3389/frai.2021.676564. eCollection 2021.
4. Variational Information Bottleneck Regularized Deep Reinforcement Learning for Efficient Robotic Skill Adaptation.
   Sensors (Basel). 2023 Jan 9;23(2):762. doi: 10.3390/s23020762.
5. RL-DOVS: Reinforcement Learning for Autonomous Robot Navigation in Dynamic Environments.
   Sensors (Basel). 2022 May 19;22(10):3847. doi: 10.3390/s22103847.
6. Data Quality-Aware Mixed-Precision Quantization via Hybrid Reinforcement Learning.
   IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):9018-9031. doi: 10.1109/TNNLS.2024.3409692. Epub 2025 May 2.
7. Robot-Assisted Pedestrian Regulation Based on Deep Reinforcement Learning.
   IEEE Trans Cybern. 2020 Apr;50(4):1669-1682. doi: 10.1109/TCYB.2018.2878977. Epub 2018 Nov 20.
8. GR-ConvNet v2: A Real-Time Multi-Grasp Detection Network for Robotic Grasping.
   Sensors (Basel). 2022 Aug 18;22(16):6208. doi: 10.3390/s22166208.
9. Human-to-Robot Handover Based on Reinforcement Learning.
   Sensors (Basel). 2024 Sep 27;24(19):6275. doi: 10.3390/s24196275.
10. Combining expert neural networks using reinforcement feedback for learning primitive grasping behavior.
   IEEE Trans Neural Netw. 2004 May;15(3):629-38. doi: 10.1109/TNN.2004.824412.

Cited By

1. FOCUS: object-centric world models for robotic manipulation.
   Front Neurorobot. 2025 Apr 30;19:1585386. doi: 10.3389/fnbot.2025.1585386. eCollection 2025.
2. An Overview of Computational Coronary Physiology Technologies Based on Medical Imaging and Artificial Intelligence.
   Rev Cardiovasc Med. 2024 Jun 13;25(6):211. doi: 10.31083/j.rcm2506211. eCollection 2024 Jun.

References Cited in This Article

1. Active Vision for Robot Manipulators Using the Free Energy Principle.
   Front Neurorobot. 2021 Mar 5;15:642780. doi: 10.3389/fnbot.2021.642780. eCollection 2021.
2. Deep learning.
   Nature. 2015 May 28;521(7553):436-44. doi: 10.1038/nature14539.
3. Human-level control through deep reinforcement learning.
   Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
4. Convolutional face finder: a neural architecture for fast and robust face detection.
   IEEE Trans Pattern Anal Mach Intell. 2004 Nov;26(11):1408-23. doi: 10.1109/tpami.2004.97.