
Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor-Critic with Hindsight Experience Replay.

Affiliations

Department of Electrical and Information Engineering, Research Center for Electrical and Information Technology, Seoul National University of Science and Technology, Seoul 01811, Korea.

Applied Robot R&D Department, Korea Institute of Industrial Technology (KITECH), Ansan 15588, Korea.

Publication

Sensors (Basel). 2020 Oct 19;20(20):5911. doi: 10.3390/s20205911.

DOI: 10.3390/s20205911
PMID: 33086774
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7590214/
Abstract

Since path planning for multi-arm manipulators is a complicated, high-dimensional problem, generating effective paths quickly for arbitrarily given start and goal locations of the end effector is not easy. In particular, for deep reinforcement learning-based path planning, the high dimensionality makes it difficult for existing reinforcement learning methods to achieve the efficient exploration that is crucial for successful training. The recently proposed soft actor-critic (SAC) is well known for its good exploration ability, owing to the entropy term in its objective function. Motivated by this, this paper proposes a SAC-based path planning algorithm. Hindsight experience replay (HER) is also employed for sample efficiency, and configuration space augmentation is used to handle the complicated configuration space of the multiple arms. Both simulation and experimental results are given to show the effectiveness of the proposed algorithm, and comparisons demonstrate that it outperforms existing methods.
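The entropy term the abstract credits for SAC's exploration enters the algorithm through its critic target: instead of bootstrapping on a plain next-state value, SAC bootstraps on the soft value, the minimum of two target critics minus α times the log-probability of the sampled next action. A minimal sketch of that target computation (generic SAC, not code from the paper; all names are illustrative):

```python
import numpy as np

def soft_td_target(rewards, q1_next, q2_next, logp_next,
                   gamma=0.99, alpha=0.2, done=None):
    """Entropy-regularized TD target used in SAC's critic update:
        y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s'))
    The -alpha * log pi term is the entropy bonus that rewards the policy
    for staying stochastic, which is what drives SAC's exploration."""
    done = np.zeros_like(rewards) if done is None else done
    soft_v = np.minimum(q1_next, q2_next) - alpha * logp_next  # soft state value
    return rewards + gamma * (1.0 - done) * soft_v
```

Taking the minimum of two critics is the usual clipped double-Q trick to curb overestimation; with α = 0 the expression reduces to an ordinary TD(0) target.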

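Hindsight experience replay, employed above for sample efficiency, turns failed episodes into useful training data: stored transitions are relabeled with goals the end effector actually reached later in the same episode, so the sparse success reward fires even when the original goal was missed. A minimal sketch of the common "future" relabeling strategy (the dict layout, field names, and 0/−1 sparse reward are assumptions for illustration, not taken from the paper):

```python
import random

def her_relabel(episode, k=4, seed=0):
    """Hindsight relabeling, 'future' strategy: for each transition, sample
    up to k achieved goals from the remainder of the episode and store a
    copy of the transition with its desired goal replaced by that achieved
    goal. Reward is sparse: 0.0 when the (relabeled) goal is achieved,
    otherwise -1.0."""
    rng = random.Random(seed)
    replay = []
    for t, tr in enumerate(episode):
        # always keep the original transition with its original goal
        replay.append({**tr, "reward": 0.0 if tr["achieved_goal"] == tr["desired_goal"] else -1.0})
        # add relabeled copies using goals achieved from step t onward
        future = episode[t:]
        for _ in range(min(k, len(future))):
            new_goal = rng.choice(future)["achieved_goal"]
            replay.append({**tr,
                           "desired_goal": new_goal,
                           "reward": 0.0 if tr["achieved_goal"] == new_goal else -1.0})
    return replay
```

Because every relabeled copy of a transition whose achieved goal matches the sampled goal earns reward 0, the critic sees successful outcomes from the very first episodes, which is what makes HER effective under sparse rewards.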

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6414/7590214/6b63abb7adc1/sensors-20-05911-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6414/7590214/d5205da971f4/sensors-20-05911-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6414/7590214/68c6c5d5d0d9/sensors-20-05911-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6414/7590214/22fa71349b32/sensors-20-05911-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6414/7590214/1f7d9aea362f/sensors-20-05911-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6414/7590214/4608b54197d5/sensors-20-05911-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6414/7590214/c2433a4d9654/sensors-20-05911-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6414/7590214/6bf5920fdf97/sensors-20-05911-g0A1.jpg

Similar Articles

1. Path Planning for Multi-Arm Manipulators Using Deep Reinforcement Learning: Soft Actor-Critic with Hindsight Experience Replay.
   Sensors (Basel). 2020 Oct 19;20(20):5911. doi: 10.3390/s20205911.
2. A Path-Planning Method Based on Improved Soft Actor-Critic Algorithm for Mobile Robots.
   Biomimetics (Basel). 2023 Oct 10;8(6):481. doi: 10.3390/biomimetics8060481.
3. End-to-End AUV Motion Planning Method Based on Soft Actor-Critic.
   Sensors (Basel). 2021 Sep 1;21(17):5893. doi: 10.3390/s21175893.
4. Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle.
   Sci Prog. 2020 Jan-Mar;103(1):36850419879024. doi: 10.1177/0036850419879024. Epub 2019 Sep 30.
5. The Intelligent Path Planning System of Agricultural Robot via Reinforcement Learning.
   Sensors (Basel). 2022 Jun 7;22(12):4316. doi: 10.3390/s22124316.
6. A priority experience replay actor-critic algorithm using self-attention mechanism for strategy optimization of discrete problems.
   PeerJ Comput Sci. 2024 Jun 28;10:e2161. doi: 10.7717/peerj-cs.2161. eCollection 2024.
7. Reinforcement learning-based dynamic obstacle avoidance and integration of path planning.
   Intell Serv Robot. 2021;14(5):663-677. doi: 10.1007/s11370-021-00387-2. Epub 2021 Oct 6.
8. Real-time route planning of unmanned aerial vehicles based on improved soft actor-critic algorithm.
   Front Neurorobot. 2022 Dec 5;16:1025817. doi: 10.3389/fnbot.2022.1025817. eCollection 2022.
9. Improved Soft Actor-Critic: Mixing Prioritized Off-Policy Samples With On-Policy Experiences.
   IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):3121-3129. doi: 10.1109/TNNLS.2022.3174051. Epub 2024 Feb 29.
10. Path Planning of a Mobile Robot for a Dynamic Indoor Environment Based on an SAC-LSTM Algorithm.
   Sensors (Basel). 2023 Dec 13;23(24):9802. doi: 10.3390/s23249802.

Cited By

1. An improved joint space Astar algorithm for a 6-DOF manipulator with pre-planning strategy.
   Sci Rep. 2025 May 25;15(1):18164. doi: 10.1038/s41598-025-01010-5.
2. Compliant Motion Planning Integrating Human Skill for Robotic Arm Collecting Tomato Bunch Based on Improved DDPG.
   Plants (Basel). 2025 Feb 20;14(5):634. doi: 10.3390/plants14050634.
3. Autonomous Driving of Mobile Robots in Dynamic Environments Based on Deep Deterministic Policy Gradient: Reward Shaping and Hindsight Experience Replay.
   Biomimetics (Basel). 2024 Jan 13;9(1):51. doi: 10.3390/biomimetics9010051.
4. Path Planning for Unmanned Surface Vehicles with Strong Generalization Ability Based on Improved Proximal Policy Optimization.
   Sensors (Basel). 2023 Oct 31;23(21):8864. doi: 10.3390/s23218864.
5. Improved Robot Path Planning Method Based on Deep Reinforcement Learning.
   Sensors (Basel). 2023 Jun 15;23(12):5622. doi: 10.3390/s23125622.
6. Multi-Agent Deep Reinforcement Learning for Multi-Robot Applications: A Survey.
   Sensors (Basel). 2023 Mar 30;23(7):3625. doi: 10.3390/s23073625.
7. A Self-Collision Detection Algorithm of a Dual-Manipulator System Based on GJK and Deep Learning.
   Sensors (Basel). 2023 Jan 3;23(1):523. doi: 10.3390/s23010523.
8. Adaptive Discount Factor for Deep Reinforcement Learning in Continuing Tasks with Uncertainty.
   Sensors (Basel). 2022 Sep 25;22(19):7266. doi: 10.3390/s22197266.
9. Medical Image Segmentation Algorithm for Three-Dimensional Multimodal Using Deep Reinforcement Learning and Big Data Analytics.
   Front Public Health. 2022 Apr 8;10:879639. doi: 10.3389/fpubh.2022.879639. eCollection 2022.
10. Dual-Arm Robot Trajectory Planning Based on Deep Reinforcement Learning under Complex Environment.
   Micromachines (Basel). 2022 Mar 31;13(4):564. doi: 10.3390/mi13040564.

References

1. A Multitasking-Oriented Robot Arm Motion Planning Scheme Based on Deep Reinforcement Learning and Twin Synchro-Control.
   Sensors (Basel). 2020 Jun 21;20(12):3515. doi: 10.3390/s20123515.
2. Learning Mobile Manipulation through Deep Reinforcement Learning.
   Sensors (Basel). 2020 Feb 10;20(3):939. doi: 10.3390/s20030939.
3. An Autonomous Path Planning Model for Unmanned Ships Based on Deep Reinforcement Learning.
   Sensors (Basel). 2020 Jan 11;20(2):426. doi: 10.3390/s20020426.
4. Implementation of a Potential Field-Based Decision-Making Algorithm on Autonomous Vehicles for Driving in Complex Environments.
   Sensors (Basel). 2019 Jul 28;19(15):3318. doi: 10.3390/s19153318.
5. Fast Marching Tree: a Fast Marching Sampling-Based Method for Optimal Motion Planning in Many Dimensions.
   Int J Rob Res. 2015 Jun;34(7):883-921. doi: 10.1177/0278364915577958. Epub 2015 May 18.
6. Human-level control through deep reinforcement learning.
   Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.