

Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning.

Authors

Peng Zhinan, Luo Rui, Hu Jiangping, Shi Kaibo, Nguang Sing Kiong, Ghosh Bijoy Kumar

Publication

IEEE Trans Neural Netw Learn Syst. 2022 Aug;33(8):4043-4055. doi: 10.1109/TNNLS.2021.3055761. Epub 2022 Aug 3.

DOI:10.1109/TNNLS.2021.3055761
PMID:33587710
Abstract

In this article, a novel reinforcement learning (RL) method is developed to solve the optimal tracking control problem of unknown nonlinear multiagent systems (MASs). Different from representative RL-based optimal control algorithms, an internal reinforce Q-learning (IrQL) method is proposed, in which an internal reinforce reward (IRR) function is introduced for each agent to improve its capability of receiving more long-term information from the local environment. In the IrQL design, a Q-function is defined on the basis of the IRR function, and an iterative IrQL algorithm is developed to learn the optimal distributed control scheme, followed by rigorous convergence and stability analysis. Furthermore, a distributed online learning framework, namely reinforce-critic-actor neural networks, is established in the implementation of the proposed approach, aimed at estimating the IRR function, the Q-function, and the optimal control scheme, respectively. The implemented procedure is designed in a data-driven way without needing knowledge of the system dynamics. Finally, simulations and comparison results with the classical method are given to demonstrate the effectiveness of the proposed tracking control method.
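The abstract describes a two-level recursion: an internal reinforce reward (IRR) that accumulates discounted local rewards, and a Q-function built on top of that IRR rather than the raw stage reward. The paper's actual method uses reinforce-critic-actor neural networks for continuous multiagent dynamics; as a rough intuition aid only, the sketch below shows a single-agent, tabular analogue of the two coupled updates. The toy environment, reward shape, and all parameter values are hypothetical, not taken from the paper.

```python
import numpy as np

# Tabular sketch of the IRR + Q-learning recursions described in the
# abstract. Everything below (environment, rewards, rates) is a
# hypothetical illustration, not the paper's neural implementation.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 2
gamma_r, gamma_q = 0.9, 0.95   # discounts for the IRR and Q recursions
alpha = 0.1                    # learning rate

R = np.zeros(n_states)               # IRR estimate per state
Q = np.zeros((n_states, n_actions))  # Q-function built on the IRR

def external_reward(s, a):
    # Toy stage reward: prefer action 0 in low states (hypothetical).
    return 1.0 if (s < 3 and a == 0) or (s >= 3 and a == 1) else 0.0

def step(s, a):
    # Toy ring-walk dynamics standing in for the unknown system.
    return (s + (1 if a == 1 else -1)) % n_states

s = 0
for _ in range(5000):
    # epsilon-greedy action selection
    a = int(rng.integers(n_actions)) if rng.random() < 0.1 else int(Q[s].argmax())
    r = external_reward(s, a)
    s_next = step(s, a)
    # IRR recursion: discounted accumulation of external rewards,
    # carrying longer-term local information than the raw stage reward.
    R[s] += alpha * (r + gamma_r * R[s_next] - R[s])
    # Q recursion driven by the IRR instead of the raw reward.
    Q[s, a] += alpha * (R[s] + gamma_q * Q[s_next].max() - Q[s, a])
    s = s_next

policy = Q.argmax(axis=1)
```

Both updates are data-driven in the sense the abstract emphasizes: neither uses the transition function explicitly, only sampled `(s, a, r, s_next)` tuples.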


Similar Articles

1. Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning.
   IEEE Trans Neural Netw Learn Syst. 2022 Aug;33(8):4043-4055. doi: 10.1109/TNNLS.2021.3055761. Epub 2022 Aug 3.
2. Optimal Tracking Control of a Nonlinear Multiagent System Using Q-Learning via Event-Triggered Reinforcement Learning.
   Entropy (Basel). 2023 Feb 5;25(2):299. doi: 10.3390/e25020299.
3. Data-Based Optimal Consensus Control for Multiagent Systems With Policy Gradient Reinforcement Learning.
   IEEE Trans Neural Netw Learn Syst. 2022 Aug;33(8):3872-3883. doi: 10.1109/TNNLS.2021.3054685. Epub 2022 Aug 3.
4. Data-Driven Optimal Bipartite Consensus Control for Second-Order Multiagent Systems via Policy Gradient Reinforcement Learning.
   IEEE Trans Cybern. 2024 Jun;54(6):3468-3478. doi: 10.1109/TCYB.2023.3276797. Epub 2024 May 30.
5. Event-Triggered Multigradient Recursive Reinforcement Learning Tracking Control for Multiagent Systems.
   IEEE Trans Neural Netw Learn Syst. 2023 Jan;34(1):366-379. doi: 10.1109/TNNLS.2021.3094901. Epub 2023 Jan 5.
6. Off-Policy Reinforcement Learning for Synchronization in Multiagent Graphical Games.
   IEEE Trans Neural Netw Learn Syst. 2017 Oct;28(10):2434-2445. doi: 10.1109/TNNLS.2016.2609500. Epub 2017 Apr 17.
7. Model-Free Reinforcement Learning for Fully Cooperative Consensus Problem of Nonlinear Multiagent Systems.
   IEEE Trans Neural Netw Learn Syst. 2022 Apr;33(4):1482-1491. doi: 10.1109/TNNLS.2020.3042508. Epub 2022 Apr 4.
8. Asynchronous iterative Q-learning based tracking control for nonlinear discrete-time multi-agent systems.
   Neural Netw. 2024 Dec;180:106667. doi: 10.1016/j.neunet.2024.106667. Epub 2024 Aug 26.
9. Model-Free Optimal Tracking Control of Nonlinear Input-Affine Discrete-Time Systems via an Iterative Deterministic Q-Learning Algorithm.
   IEEE Trans Neural Netw Learn Syst. 2022 Jun 3;PP. doi: 10.1109/TNNLS.2022.3178746.
10. Model-Free Optimal Tracking Control via Critic-Only Q-Learning.
    IEEE Trans Neural Netw Learn Syst. 2016 Oct;27(10):2134-44. doi: 10.1109/TNNLS.2016.2585520. Epub 2016 Jul 12.

Cited By

1. Linear matrix genetic programming as a tool for data-driven black-box control-oriented modeling in conditions of limited access to training data.
   Sci Rep. 2024 Jun 3;14(1):12666. doi: 10.1038/s41598-024-63419-8.
2. Optimal Tracking Control of a Nonlinear Multiagent System Using Q-Learning via Event-Triggered Reinforcement Learning.
   Entropy (Basel). 2023 Feb 5;25(2):299. doi: 10.3390/e25020299.