Trajectory Tracking Control for Robotic Manipulator Based on Soft Actor-Critic and Generative Adversarial Imitation Learning.

Author information

Hu Jintao, Wang Fujie, Li Xing, Qin Yi, Guo Fang, Jiang Ming

Affiliation

School of Computer Science and Technology, Dongguan University of Technology, Dongguan 523808, China.

Publication information

Biomimetics (Basel). 2024 Dec 21;9(12):779. doi: 10.3390/biomimetics9120779.

Abstract

This paper proposes a deep reinforcement learning (DRL) approach based on generative adversarial imitation learning (GAIL) and long short-term memory (LSTM) to solve tracking control problems for robotic manipulators subject to saturation constraints and random disturbances, without learning the dynamic or kinematic model of the manipulator. Specifically, the saturation constraints limit the torque and joint angle to a fixed range. First, to cope with instability during training and obtain a stable policy, soft actor-critic (SAC) is combined with LSTM. An LSTM architecture designed for robotic manipulator systems captures the trends in joint position over time more comprehensively, thereby reducing instability when training manipulators for tracking control tasks. Second, the policy obtained by SAC-LSTM is used as expert data for GAIL to learn a better control policy. The resulting SAC-LSTM-GAIL (SL-GAIL) algorithm does not need to spend time exploring unknown environments and learns the control strategy directly from stable expert data. Finally, simulation results demonstrate that the proposed SL-GAIL algorithm effectively accomplishes the end-effector tracking task and exhibits better stability than other algorithms in a test environment with interference.
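
As a rough illustration of two ideas mentioned in the abstract, the sketch below clamps commanded joint torques to a bounded range (a saturation constraint) and converts a discriminator output into a GAIL-style surrogate reward. It is not the authors' implementation: the limit value, the function names, and the reward convention (the discriminator output is taken as the probability that a state-action pair came from the expert) are all assumptions made for illustration only.

```python
import math

# Hypothetical torque limit; the abstract only states that torque and
# joint angle are restricted to a certain range, not the actual bounds.
TORQUE_LIMIT = 5.0  # N*m, assumed value


def saturate(value, limit):
    """Clamp a scalar to the symmetric range [-limit, limit]."""
    return max(-limit, min(limit, value))


def apply_saturation(torques, limit=TORQUE_LIMIT):
    """Clamp each commanded joint torque to the allowed range."""
    return [saturate(t, limit) for t in torques]


def gail_reward(d_prob, eps=1e-8):
    """Surrogate reward from a discriminator output d_prob in (0, 1).

    Assumed convention: d_prob estimates the probability that the
    (state, action) pair came from the expert, so the reward is
    -log(1 - d_prob). The eps floor avoids log(0) when d_prob is 1.
    """
    return -math.log(max(1.0 - d_prob, eps))


if __name__ == "__main__":
    # Torques outside the limit are clipped; those inside pass through.
    print(apply_saturation([10.0, -7.5, 1.0]))
    # Pairs the discriminator judges more expert-like earn a higher reward.
    print(gail_reward(0.9) > gail_reward(0.1))
```

Under this convention, a policy trained on the surrogate reward is pushed toward state-action pairs that the discriminator cannot distinguish from the SAC-LSTM expert data, while the clamp keeps every commanded torque within the permitted range regardless of what the policy outputs.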

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1f8/11727619/41fccb9de902/biomimetics-09-00779-g001.jpg
