


Representation Learning and Reinforcement Learning for Dynamic Complex Motion Planning System.

Authors

Zhou Chengmin, Huang Bingding, Franti Pasi

Publication

IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):11049-11063. doi: 10.1109/TNNLS.2023.3247160. Epub 2024 Aug 5.

DOI: 10.1109/TNNLS.2023.3247160
PMID: 37028017
Abstract

Indoor motion planning challenges researchers because of the high density and unpredictability of moving obstacles. Classical algorithms work well in the case of static obstacles but suffer from collisions in the case of dense and dynamic obstacles. Recent reinforcement learning (RL) algorithms provide safe solutions for multiagent robotic motion planning systems. However, these algorithms face challenges in convergence: slow convergence speed and suboptimal converged results. Inspired by RL and representation learning, we introduce ALN-DSAC: a hybrid motion planning algorithm in which attention-based long short-term memory (LSTM) and a novel data replay scheme are combined with discrete soft actor-critic (SAC). First, we implement a discrete SAC algorithm, i.e., SAC in the setting of a discrete action space. Second, we replace the existing distance-based LSTM encoding with attention-based encoding to improve the data quality. Third, we introduce a novel data replay method that combines online and offline learning to improve the efficacy of data replay. The convergence of ALN-DSAC outperforms that of trainable state-of-the-art methods. Evaluations demonstrate that our algorithm achieves nearly 100% success with less time to reach the goal in motion planning tasks compared with state-of-the-art methods. The test code is available at https://github.com/CHUENGMINCHOU/ALN-DSAC.
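The abstract's second contribution, replacing distance-based ordering of obstacles with attention-based encoding, can be illustrated with a minimal sketch. The function names, dimensions, and dot-product scoring below are illustrative assumptions, not the paper's actual implementation (which uses an attention-based LSTM): each moving obstacle gets an embedding, attention scores are computed against a query derived from the robot's state, and the crowd is summarized as an attention-weighted sum rather than by sorting on distance.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_encode(obstacle_feats, query):
    """Summarize n obstacle embeddings into one fixed-size vector.

    obstacle_feats: (n, d) array, one embedding per moving obstacle.
    query: (d,) array, an embedding of the robot's own state.
    Returns a (d,) attention-weighted combination of the obstacles.
    """
    scores = obstacle_feats @ query      # dot-product relevance score per obstacle
    weights = softmax(scores)            # normalize scores into attention weights
    return weights @ obstacle_feats      # weighted sum -> fixed-size crowd encoding

# Tiny usage example: the obstacle aligned with the query dominates the encoding.
feats = np.eye(2)                        # two obstacles with orthogonal embeddings
enc = attention_encode(feats, np.array([1.0, 0.0]))
```

The practical point of this design is that the encoding is permutation-invariant and independent of the number of obstacles, so the downstream actor-critic network sees a fixed-size input regardless of crowd density.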


Similar Articles

1. Representation Learning and Reinforcement Learning for Dynamic Complex Motion Planning System.
   IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):11049-11063. doi: 10.1109/TNNLS.2023.3247160. Epub 2024 Aug 5.
2. Path Planning of a Mobile Robot for a Dynamic Indoor Environment Based on an SAC-LSTM Algorithm.
   Sensors (Basel). 2023 Dec 13;23(24):9802. doi: 10.3390/s23249802.
3. End-to-End AUV Motion Planning Method Based on Soft Actor-Critic.
   Sensors (Basel). 2021 Sep 1;21(17):5893. doi: 10.3390/s21175893.
4. Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor-Critic with Hindsight Experience Replay.
   Sensors (Basel). 2020 Oct 19;20(20):5911. doi: 10.3390/s20205911.
5. Adaptive Hybrid Optimization Learning-Based Accurate Motion Planning of Multi-Joint Arm.
   IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5440-5451. doi: 10.1109/TNNLS.2023.3262109. Epub 2023 Sep 1.
6. A priority experience replay actor-critic algorithm using self-attention mechanism for strategy optimization of discrete problems.
   PeerJ Comput Sci. 2024 Jun 28;10:e2161. doi: 10.7717/peerj-cs.2161. eCollection 2024.
7. A Path-Planning Method Based on Improved Soft Actor-Critic Algorithm for Mobile Robots.
   Biomimetics (Basel). 2023 Oct 10;8(6):481. doi: 10.3390/biomimetics8060481.
8. Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units.
   BMC Med Inform Decis Mak. 2020 Jul 9;20(Suppl 3):124. doi: 10.1186/s12911-020-1120-5.
9. Safe Reinforcement Learning With Stability Guarantee for Motion Planning of Autonomous Vehicles.
   IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5435-5444. doi: 10.1109/TNNLS.2021.3084685. Epub 2021 Nov 30.
10. Actor-Critic Alignment for Offline-to-Online Reinforcement Learning.
   Proc Mach Learn Res. 2023 Jul;202:40452-40474.