Robot grasping method optimization using improved deep deterministic policy gradient algorithm of deep reinforcement learning.

Authors

Zhang Hongxu, Wang Fei, Wang Jianhui, Cui Ben

Affiliations

College of Information Science and Engineering, Northeastern University, Shenyang, China.

Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China.

Publication information

Rev Sci Instrum. 2021 Feb 1;92(2):025114. doi: 10.1063/5.0034101.

DOI:10.1063/5.0034101

PMID:33648152

Abstract

Robot grasping has become an active research field, and the demands placed on robot manipulation keep rising. In earlier studies, grasping based on traditional target detection algorithms was often inefficient. This article improves a deep reinforcement learning algorithm to raise grasping efficiency and to handle the effect of unknown disturbances on grasping. Exploiting the fact that deep reinforcement learning actively explores an unknown environment, a Gaussian parameter Deep Deterministic Policy Gradient (Gaussian-DDPG) algorithm based on the Importance-Weighted Autoencoder (IWAE) is proposed so that the robot learns the grasping task autonomously. Traditional coordinate positioning methods and deep learning methods grasp poorly under disturbance (for example, when the target object moves). The IWAE compresses the high-dimensional raw visual input into a latent space and passes the result to the deep reinforcement learning network as part of the state. Building on the classic DDPG algorithm, the method smoothly adds Gaussian parameters to improve exploration and dynamically sets the robot grasping space parameters to adapt to workspaces of multiple scales, ultimately achieving accurate grasping. Because the position information derived from vision may be biased, the grasping position is further refined using the manipulator's torque information, which improves grasping efficiency for disturbed objects.
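The abstract does not say how the Gaussian parameters are added or how the state vector is assembled, so the following is only a minimal sketch of the general idea, not the authors' Gaussian-DDPG. It assumes a generic scheme: a latent code from a visual encoder is concatenated with torque readings to form the state, and zero-mean Gaussian noise with a smoothly decaying scale is added to a deterministic action, which is then clipped to workspace bounds. All function names, the decay schedule, the bounds, and the numeric values are hypothetical.

```python
import random


def build_state(latent, joint_torques):
    """Concatenate a visual latent code with torque readings into one state vector."""
    return list(latent) + list(joint_torques)


def decayed_sigma(sigma0, decay, step, sigma_min=0.01):
    """Shrink the noise scale exponentially (assumed schedule) down to a floor."""
    return max(sigma_min, sigma0 * decay ** step)


def gaussian_explore(action, sigma, low, high, rng):
    """Add zero-mean Gaussian noise to a deterministic action, then clip to bounds."""
    noisy = [a + rng.gauss(0.0, sigma) for a in action]
    return [min(max(x, lo), hi) for x, lo, hi in zip(noisy, low, high)]


rng = random.Random(0)                 # seeded for reproducibility

latent = [0.12, -0.40, 0.88]           # placeholder for an encoder output
torques = [0.05, 0.00]                 # placeholder torque readings
state = build_state(latent, torques)   # what the actor would receive

low = [-0.3, -0.3, 0.0]                # hypothetical workspace bounds
high = [0.3, 0.3, 0.5]
policy_action = [0.25, -0.10, 0.20]    # placeholder for the actor's output

sigma = decayed_sigma(sigma0=0.2, decay=0.99, step=100)
action = gaussian_explore(policy_action, sigma, low, high, rng)
```

Changing `low` and `high` between episodes is one simple way to imitate a workspace whose scale varies; whether the paper's dynamic grasping space parameters work this way is not stated in the abstract.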

