Asynchronous iterative Q-learning based tracking control for nonlinear discrete-time multi-agent systems.

Affiliation

College of Electronics and Information Engineering, Southwest University, Chongqing, 400715, PR China.

Publication information

Neural Netw. 2024 Dec;180:106667. doi: 10.1016/j.neunet.2024.106667. Epub 2024 Aug 26.

DOI:10.1016/j.neunet.2024.106667

Abstract

This paper addresses the tracking control problem of nonlinear discrete-time multi-agent systems (MASs). First, a local neighborhood error system (LNES) is constructed. Then, a novel tracking algorithm based on asynchronous iterative Q-learning (AIQL) is developed, which transforms the tracking problem into the optimal regulation of the LNES. The AIQL-based algorithm maintains two Q values for each agent i: one is used for improving the control policy and the other for evaluating the value of the control policy. Moreover, a convergence analysis of the LNES is given, showing that the LNES converges to 0 and the tracking problem is solved. A neural network-based actor-critic framework is used to implement AIQL. The critic network of AIQL is composed of two neural networks, which approximate the two Q values respectively. Finally, simulation results are given to verify the performance of the developed algorithm. They show that the AIQL-based tracking algorithm achieves a lower cost value and faster convergence than the IQL-based tracking algorithm.
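The abstract does not state how the local neighborhood error is defined. As a rough illustration only, the sketch below uses a form that is common in the multi-agent tracking literature, e_i = Σ_j a_ij (x_i − x_j) + b_i (x_i − x_0), with scalar states. The function name, the adjacency weights a_ij, the pinning gains b_i, and the three-agent topology are all assumptions made for this example and may differ from the paper's construction.

```python
def local_neighborhood_errors(states, leader, adjacency, pinning):
    """Assumed form: e_i = sum_j a_ij * (x_i - x_j) + b_i * (x_i - x_0).

    states    : list of scalar follower states x_i
    leader    : scalar leader (reference) state x_0
    adjacency : adjacency[i][j] = a_ij, weight agent i puts on neighbor j
    pinning   : pinning[i] = b_i, nonzero if agent i observes the leader
    """
    n = len(states)
    errors = []
    for i in range(n):
        # Disagreement between agent i and each of its neighbors.
        consensus = sum(adjacency[i][j] * (states[i] - states[j]) for j in range(n))
        # Direct tracking error, counted only if agent i sees the leader.
        errors.append(consensus + pinning[i] * (states[i] - leader))
    return errors


# Hypothetical 3-agent chain: agent 0 sees the leader, agent 1 listens to
# agent 0, and agent 2 listens to agent 1.
adjacency = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
pinning = [1, 0, 0]

print(local_neighborhood_errors([3.0, 2.5, 2.0], 2.0, adjacency, pinning))
print(local_neighborhood_errors([2.0, 2.0, 2.0], 2.0, adjacency, pinning))
```

Under this assumed definition, every error is zero when all followers sit on the leader's state, which is why driving the error system to 0 corresponds to solving the tracking problem.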
