


Representation Learning and Reinforcement Learning for Dynamic Complex Motion Planning System.

Authors

Zhou Chengmin, Huang Bingding, Franti Pasi

Publication

IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):11049-11063. doi: 10.1109/TNNLS.2023.3247160. Epub 2024 Aug 5.

DOI: 10.1109/TNNLS.2023.3247160
PMID: 37028017
Abstract

Indoor motion planning challenges researchers because of the high density and unpredictability of moving obstacles. Classical algorithms work well in the case of static obstacles but suffer from collisions in the case of dense and dynamic obstacles. Recent reinforcement learning (RL) algorithms provide safe solutions for multiagent robotic motion planning systems. However, these algorithms face challenges in convergence: slow convergence speed and suboptimal converged results. Inspired by RL and representation learning, we introduce ALN-DSAC: a hybrid motion planning algorithm in which attention-based long short-term memory (LSTM) and a novel data replay scheme are combined with discrete soft actor-critic (SAC). First, we implement a discrete SAC algorithm, i.e., SAC in the setting of a discrete action space. Second, we replace the existing distance-based LSTM encoding with attention-based encoding to improve the data quality. Third, we introduce a novel data replay method that combines online and offline learning to improve the efficacy of data replay. The convergence of ALN-DSAC outperforms that of trainable state-of-the-art methods. Evaluations demonstrate that our algorithm achieves nearly 100% success with less time to reach the goal in motion planning tasks compared with state-of-the-art methods. The test code is available at https://github.com/CHUENGMINCHOU/ALN-DSAC.
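The abstract's second contribution, replacing distance-based ordering of obstacles with attention-based encoding, can be illustrated with a minimal sketch. The function names, dimensions, and dot-product scoring below are illustrative assumptions, not the paper's actual implementation (which uses an attention-based LSTM): each moving obstacle gets an embedding, attention scores are computed against a query derived from the robot's state, and the crowd is summarized as an attention-weighted sum rather than by sorting on distance.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_encode(obstacle_feats, query):
    """Summarize n obstacle embeddings into one fixed-size vector.

    obstacle_feats: (n, d) array, one embedding per moving obstacle.
    query: (d,) array, an embedding of the robot's own state.
    Returns a (d,) attention-weighted combination of the obstacles.
    """
    scores = obstacle_feats @ query      # dot-product relevance score per obstacle
    weights = softmax(scores)            # normalize scores into attention weights
    return weights @ obstacle_feats      # weighted sum -> fixed-size crowd encoding

# Tiny usage example: the obstacle aligned with the query dominates the encoding.
feats = np.eye(2)                        # two obstacles with orthogonal embeddings
enc = attention_encode(feats, np.array([1.0, 0.0]))
```

The practical point of this design is that the encoding is permutation-invariant and independent of the number of obstacles, so the downstream actor-critic network sees a fixed-size input regardless of crowd density.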


Similar Articles

1. Representation Learning and Reinforcement Learning for Dynamic Complex Motion Planning System.
   IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):11049-11063. doi: 10.1109/TNNLS.2023.3247160. Epub 2024 Aug 5.
2. Path Planning of a Mobile Robot for a Dynamic Indoor Environment Based on an SAC-LSTM Algorithm.
   Sensors (Basel). 2023 Dec 13;23(24):9802. doi: 10.3390/s23249802.
3. End-to-End AUV Motion Planning Method Based on Soft Actor-Critic.
   Sensors (Basel). 2021 Sep 1;21(17):5893. doi: 10.3390/s21175893.
4. Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor-Critic with Hindsight Experience Replay.
   Sensors (Basel). 2020 Oct 19;20(20):5911. doi: 10.3390/s20205911.
5. Adaptive Hybrid Optimization Learning-Based Accurate Motion Planning of Multi-Joint Arm.
   IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5440-5451. doi: 10.1109/TNNLS.2023.3262109. Epub 2023 Sep 1.
6. A priority experience replay actor-critic algorithm using self-attention mechanism for strategy optimization of discrete problems.
   PeerJ Comput Sci. 2024 Jun 28;10:e2161. doi: 10.7717/peerj-cs.2161. eCollection 2024.
7. A Path-Planning Method Based on Improved Soft Actor-Critic Algorithm for Mobile Robots.
   Biomimetics (Basel). 2023 Oct 10;8(6):481. doi: 10.3390/biomimetics8060481.
8. Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units.
   BMC Med Inform Decis Mak. 2020 Jul 9;20(Suppl 3):124. doi: 10.1186/s12911-020-1120-5.
9. Safe Reinforcement Learning With Stability Guarantee for Motion Planning of Autonomous Vehicles.
   IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5435-5444. doi: 10.1109/TNNLS.2021.3084685. Epub 2021 Nov 30.
10. Actor-Critic Alignment for Offline-to-Online Reinforcement Learning.
   Proc Mach Learn Res. 2023 Jul;202:40452-40474.