Trajectory Tracking Control for Robotic Manipulator Based on Soft Actor-Critic and Generative Adversarial Imitation Learning.

Authors

Hu Jintao, Wang Fujie, Li Xing, Qin Yi, Guo Fang, Jiang Ming

Affiliation

School of Computer Science and Technology, Dongguan University of Technology, Dongguan 523808, China.

Publication

Biomimetics (Basel). 2024 Dec 21;9(12):779. doi: 10.3390/biomimetics9120779.

DOI:10.3390/biomimetics9120779

PMID:39727785

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11727619/

Abstract

This paper proposes a deep reinforcement learning (DRL) approach based on generative adversarial imitation learning (GAIL) and long short-term memory (LSTM) to solve tracking control problems for robotic manipulators subject to saturation constraints and random disturbances, without learning the dynamic or kinematic model of the manipulator. Specifically, the torque and joint angle are restricted to a bounded range. First, to cope with instability during training and obtain a stable policy, soft actor-critic (SAC) is combined with LSTM. An LSTM architecture designed for robotic manipulator systems captures how joint positions change over time, which reduces instability when training manipulators for tracking control tasks. Second, the policy obtained by SAC-LSTM provides the expert data from which GAIL learns a better control policy. The resulting SAC-LSTM-GAIL (SL-GAIL) algorithm does not need to spend time exploring unknown environments; it learns the control strategy directly from stable expert data. Finally, simulation results show that the proposed SL-GAIL algorithm effectively accomplishes the end-effector tracking task and is more stable than other algorithms in a test environment with disturbances.
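The abstract gives no equations, so the following is only a minimal sketch of two ideas it names: a saturation constraint on the torque under random disturbance, and the kind of surrogate reward GAIL derives from a discriminator. The function names, the Gaussian disturbance model, the choice to add the disturbance after saturation, and the reward convention `-log(1 - D(s, a))` (with `D` read as the probability that a state-action pair came from the expert) are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def saturate(value, limit):
    """Clamp a command to the symmetric range [-limit, limit]."""
    return max(-limit, min(limit, value))


def applied_torque(command, tau_max, disturbance_std, rng):
    """Saturate the policy's torque command, then add a zero-mean Gaussian disturbance.

    Assumption: the disturbance acts after saturation; the abstract does not specify this.
    """
    return saturate(command, tau_max) + rng.gauss(0.0, disturbance_std)


def gail_reward(d_prob, eps=1e-8):
    """Surrogate imitation reward -log(1 - D(s, a)).

    Assumption: d_prob is the discriminator's probability that (s, a) is expert data,
    so the reward grows as the policy becomes harder to tell apart from the expert.
    eps keeps the logarithm finite when d_prob reaches 1.
    """
    return -math.log(max(eps, 1.0 - d_prob))


if __name__ == "__main__":
    rng = random.Random(0)
    # A command of 10 with a limit of 3 is clipped to 3 before the disturbance is added.
    print(applied_torque(10.0, 3.0, 0.1, rng))
    print(gail_reward(0.5))
```

In a setup like the one the abstract describes, the expert pairs fed to the discriminator would come from rollouts of the trained SAC-LSTM policy, and the imitation policy would be updated against `gail_reward` rather than an environment reward.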

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c1f8/11727619/41fccb9de902/biomimetics-09-00779-g001.jpg

Similar articles

End-to-End AUV Motion Planning Method Based on Soft Actor-Critic.

Sensors (Basel). 2021 Sep 1;21(17):5893. doi: 10.3390/s21175893.

Restored Action Generative Adversarial Imitation Learning from observation for robot manipulator.

ISA Trans. 2022 Oct;129(Pt B):684-690. doi: 10.1016/j.isatra.2022.02.041. Epub 2022 Mar 7.

Distributional generative adversarial imitation learning with reproducing kernel generalization.

Neural Netw. 2023 Aug;165:43-59. doi: 10.1016/j.neunet.2023.05.027. Epub 2023 May 25.

Reinforcement Learning Tracking Control for Robotic Manipulator With Kernel-Based Dynamic Model.

IEEE Trans Neural Netw Learn Syst. 2020 Sep;31(9):3570-3578. doi: 10.1109/TNNLS.2019.2945019. Epub 2019 Nov 1.

Deep Reinforcement Learning Based Trajectory Planning Under Uncertain Constraints.

Front Neurorobot. 2022 May 2;16:883562. doi: 10.3389/fnbot.2022.883562. eCollection 2022.

A High-Efficient Reinforcement Learning Approach for Dexterous Manipulation.

Biomimetics (Basel). 2023 Jun 16;8(2):264. doi: 10.3390/biomimetics8020264.

Deep Reinforcement Learning-Based End-to-End Control for UAV Dynamic Target Tracking.

Biomimetics (Basel). 2022 Nov 11;7(4):197. doi: 10.3390/biomimetics7040197.

Deep reinforcement learning trajectory planning for robotic manipulator based on simulation-efficient training.

Sci Rep. 2025 Mar 10;15(1):8286. doi: 10.1038/s41598-025-93175-2.

Reinforcement Learning-Based Fixed-Time Trajectory Tracking Control for Uncertain Robotic Manipulators With Input Saturation.

IEEE Trans Neural Netw Learn Syst. 2023 Aug;34(8):4584-4595. doi: 10.1109/TNNLS.2021.3116713. Epub 2023 Aug 4.

Cited by

An improved multi-objective particle swarm optimization algorithm for the design of foundation pit of rail transit upper cover project.

Sci Rep. 2025 Mar 26;15(1):10403. doi: 10.1038/s41598-025-87350-8.
