A Survey on Deep Reinforcement Learning Algorithms for Robotic Manipulation.

Affiliations

School of Electrical and Computer Engineering, University of Oklahoma, Norman, OK 73019, USA.

Department of Electronic and Electrical Engineering, University of Strathclyde, Glasgow G1 1XW, UK.

Publication information

Sensors (Basel). 2023 Apr 5;23(7):3762. doi: 10.3390/s23073762.

DOI: 10.3390/s23073762
PMID: 37050822
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10098871/
Abstract

Robotic manipulation challenges, such as grasping and object manipulation, have been tackled successfully with the help of deep reinforcement learning systems. We give an overview of the recent advances in deep reinforcement learning algorithms for robotic manipulation tasks in this review. We begin by outlining the fundamental ideas of reinforcement learning and the parts of a reinforcement learning system. The many deep reinforcement learning algorithms, such as value-based methods, policy-based methods, and actor-critic approaches, that have been suggested for robotic manipulation tasks are then covered. We also examine the numerous issues that have arisen when applying these algorithms to robotics tasks, as well as the various solutions that have been put forth to deal with these issues. Finally, we highlight several unsolved research issues and talk about possible future directions for the subject.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/c8dded02788f/sensors-23-03762-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/c52e12fd8f93/sensors-23-03762-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/0cb6306e70cf/sensors-23-03762-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/7839313b27c5/sensors-23-03762-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/984de1511d99/sensors-23-03762-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/03a683b080a7/sensors-23-03762-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/d5e20ca1edf4/sensors-23-03762-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/1e3adb86cac8/sensors-23-03762-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/1d3836e97431/sensors-23-03762-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/240a/10098871/272cf3ab77bf/sensors-23-03762-g010.jpg