Deep Reinforcement Learning Based Trajectory Planning Under Uncertain Constraints.

Authors

Chen Lienhung, Jiang Zhongliang, Cheng Long, Knoll Alois C, Zhou Mingchuan

Affiliations

Department of Computer Science, Technische Universität München, Munich, Germany.

College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, China.

Publication information

Front Neurorobot. 2022 May 2;16:883562. doi: 10.3389/fnbot.2022.883562. eCollection 2022.

DOI: 10.3389/fnbot.2022.883562
PMID: 35586262
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9108367/
Abstract

Advances in algorithms have made deep reinforcement learning (DRL) a viable approach to trajectory planning in uncertain environments. Unlike traditional trajectory planning, which requires considerable effort to handle complicated high-dimensional problems, DRL enables a robot manipulator to learn and discover optimal trajectories autonomously by interacting with the environment. In this article, we present state-of-the-art DRL-based collision-avoidance trajectory planning for uncertain environments, such as a safe human-coexistent environment. Because the robot manipulator operates in high-dimensional continuous state-action spaces, the model-free, policy gradient-based soft actor-critic (SAC) and deep deterministic policy gradient (DDPG) frameworks are adapted to our scenario for comparison. To assess our proposal, we simulate a 7-DOF Panda (Franka Emika) robot manipulator in the PyBullet physics engine and evaluate its trajectory planning by reward, loss, safe rate, and accuracy. Our results show the effectiveness of state-of-the-art DRL algorithms for trajectory planning in uncertain environments, with zero collisions after 5,000 episodes of training.
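
The abstract names safe rate and accuracy as evaluation metrics but does not define them. The sketch below shows one plausible way to compute them from logged evaluation episodes. It assumes that safe rate is the fraction of episodes without a collision and that accuracy is the fraction of episodes ending within a distance tolerance of the target. These definitions, the `Episode` record, and the 0.05 m tolerance are illustrative assumptions, not taken from the article.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Episode:
    """Outcome of one evaluation episode (hypothetical log format)."""
    collided: bool          # True if the arm touched any obstacle
    final_position: tuple   # end-effector (x, y, z) at episode end, in metres
    target_position: tuple  # goal (x, y, z), in metres


def safe_rate(episodes):
    """Fraction of episodes that finished without a collision (assumed definition)."""
    return sum(not ep.collided for ep in episodes) / len(episodes)


def accuracy(episodes, tolerance=0.05):
    """Fraction of episodes ending within `tolerance` metres of the target (assumed definition)."""
    hits = sum(
        dist(ep.final_position, ep.target_position) <= tolerance
        for ep in episodes
    )
    return hits / len(episodes)


# Made-up episode outcomes, for illustration only.
episodes = [
    Episode(False, (0.50, 0.00, 0.30), (0.50, 0.00, 0.30)),
    Episode(False, (0.40, 0.10, 0.30), (0.50, 0.00, 0.30)),
    Episode(True, (0.50, 0.01, 0.30), (0.50, 0.00, 0.30)),
    Episode(False, (0.52, 0.00, 0.31), (0.50, 0.00, 0.30)),
]
print(f"safe rate: {safe_rate(episodes):.2f}")
print(f"accuracy:  {accuracy(episodes):.2f}")
```

Under these assumed definitions the two metrics are independent: an episode can reach the target and still count as unsafe if a collision occurred along the way, as the third sample episode does.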


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/357a/9108367/c2e5f82552c1/fnbot-16-883562-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/357a/9108367/da96ee990f35/fnbot-16-883562-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/357a/9108367/10ce7d6ea342/fnbot-16-883562-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/357a/9108367/18e99101c445/fnbot-16-883562-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/357a/9108367/39a9db3ba6af/fnbot-16-883562-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/357a/9108367/992bfb005835/fnbot-16-883562-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/357a/9108367/6c8b197a672d/fnbot-16-883562-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/357a/9108367/c8d665a7ed67/fnbot-16-883562-g0008.jpg

Similar articles

1
Deep Reinforcement Learning Based Trajectory Planning Under Uncertain Constraints.
Front Neurorobot. 2022 May 2;16:883562. doi: 10.3389/fnbot.2022.883562. eCollection 2022.
2
Design and Experimental Validation of Deep Reinforcement Learning-Based Fast Trajectory Planning and Control for Mobile Robot in Unknown Environment.
IEEE Trans Neural Netw Learn Syst. 2024 Apr;35(4):5778-5792. doi: 10.1109/TNNLS.2022.3209154. Epub 2024 Apr 4.
3
Deep deterministic policy gradient algorithm: A systematic review.
Heliyon. 2024 May 7;10(9):e30697. doi: 10.1016/j.heliyon.2024.e30697. eCollection 2024 May 15.
4
Dual-Arm Robot Trajectory Planning Based on Deep Reinforcement Learning under Complex Environment.
Micromachines (Basel). 2022 Mar 31;13(4):564. doi: 10.3390/mi13040564.
5
An Autonomous Path Planning Model for Unmanned Ships Based on Deep Reinforcement Learning.
Sensors (Basel). 2020 Jan 11;20(2):426. doi: 10.3390/s20020426.
6
End-to-End AUV Motion Planning Method Based on Soft Actor-Critic.
Sensors (Basel). 2021 Sep 1;21(17):5893. doi: 10.3390/s21175893.
7
Deep Reinforcement Learning-Based Accurate Control of Planetary Soft Landing.
Sensors (Basel). 2021 Dec 6;21(23):8161. doi: 10.3390/s21238161.
8
Model-Based Predictive Control and Reinforcement Learning for Planning Vehicle-Parking Trajectories for Vertical Parking Spaces.
Sensors (Basel). 2023 Aug 11;23(16):7124. doi: 10.3390/s23167124.
9
Predictive hierarchical reinforcement learning for path-efficient mapless navigation with moving target.
Neural Netw. 2023 Aug;165:677-688. doi: 10.1016/j.neunet.2023.06.007. Epub 2023 Jun 10.
10
Improved Robot Path Planning Method Based on Deep Reinforcement Learning.
Sensors (Basel). 2023 Jun 15;23(12):5622. doi: 10.3390/s23125622.

Cited by

1
Robot movement planning for obstacle avoidance using reinforcement learning.
Sci Rep. 2025 Sep 12;15(1):32506. doi: 10.1038/s41598-025-17740-5.
2
Deep Reinforcement Learning Environment Approach Based on Nanocatalyst XAS Diagnostics Graphic Formalization.
Materials (Basel). 2023 Jul 28;16(15):5321. doi: 10.3390/ma16155321.
3
Hierarchical Trajectory Planning for Narrow-Space Automated Parking with Deep Reinforcement Learning: A Federated Learning Scheme.
Sensors (Basel). 2023 Apr 18;23(8):4087. doi: 10.3390/s23084087.
4
Reinforcement learning based variable damping control of wearable robotic limbs for maintaining astronaut pose during extravehicular activity.
Front Neurorobot. 2023 Feb 15;17:1093718. doi: 10.3389/fnbot.2023.1093718. eCollection 2023.
5
Realistic Actor-Critic: A framework for balance between value overestimation and underestimation.
Front Neurorobot. 2023 Jan 9;16:1081242. doi: 10.3389/fnbot.2022.1081242. eCollection 2022.
6
An immediate-return reinforcement learning for the atypical Markov decision processes.
Front Neurorobot. 2022 Dec 13;16:1012427. doi: 10.3389/fnbot.2022.1012427. eCollection 2022.
7
Implementing Monocular Visual-Tactile Sensors for Robust Manipulation.
Cyborg Bionic Syst. 2022 Sep 5;2022:9797562. doi: 10.34133/2022/9797562. eCollection 2022.
8
The modularization design and autonomous motion control of a new baby stroller.
Front Hum Neurosci. 2022 Sep 30;16:1000382. doi: 10.3389/fnhum.2022.1000382. eCollection 2022.

References

1
Artificial Intelligence and the Common Sense of Animals.
Trends Cogn Sci. 2020 Nov;24(11):862-872. doi: 10.1016/j.tics.2020.09.002. Epub 2020 Oct 8.
2
A Method on Dynamic Path Planning for Robotic Manipulator Autonomous Obstacle Avoidance Based on an Improved RRT Algorithm.
Sensors (Basel). 2018 Feb 13;18(2):571. doi: 10.3390/s18020571.