Asynchronous iterative Q-learning based tracking control for nonlinear discrete-time multi-agent systems.

Affiliations

College of Electronics and Information Engineering, Southwest University, Chongqing, 400715, PR China.

Publication information

Neural Netw. 2024 Dec;180:106667. doi: 10.1016/j.neunet.2024.106667. Epub 2024 Aug 26.

Abstract

This paper addresses the tracking control problem of nonlinear discrete-time multi-agent systems (MASs). First, a local neighborhood error system (LNES) is constructed. Then, a novel tracking algorithm based on asynchronous iterative Q-learning (AIQL) is developed, which transforms the tracking problem into the optimal regulation of the LNES. The AIQL-based algorithm maintains two Q-values for each agent i: one used to improve the control policy and one used to evaluate the value of the current control policy. Moreover, the convergence of the LNES is established: it is shown that the LNES converges to zero, so the tracking problem is solved. A neural network-based actor-critic framework is used to implement AIQL, in which the critic consists of two neural networks, one approximating each of the two Q-values. Finally, simulation results verify the performance of the developed algorithm and show that the AIQL-based tracking algorithm achieves a lower cost value and faster convergence speed than the IQL-based tracking algorithm.
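The abstract describes the architecture only at a high level. As a rough illustration, and not the authors' implementation, the following minimal PyTorch sketch shows the per-agent structure the abstract names: an actor network over the local neighborhood error, and a critic composed of two Q-networks, one driving policy improvement and one performing policy evaluation. All dimensions, learning rates, the discount factor, and the exact update ordering here are assumptions; the paper's actual AIQL update laws and convergence analysis are given in the text itself.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the paper does not fix these in the abstract.
ERROR_DIM, ACTION_DIM, HIDDEN = 4, 2, 64

class QNet(nn.Module):
    """One critic network approximating a Q-value over (error, action) pairs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ERROR_DIM + ACTION_DIM, HIDDEN), nn.Tanh(),
            nn.Linear(HIDDEN, 1),
        )
    def forward(self, e, u):
        return self.net(torch.cat([e, u], dim=-1))

class Actor(nn.Module):
    """Actor network mapping the local neighborhood error to a control input."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ERROR_DIM, HIDDEN), nn.Tanh(),
            nn.Linear(HIDDEN, ACTION_DIM),
        )
    def forward(self, e):
        return self.net(e)

# Per-agent components: one actor and a critic made of TWO Q-networks,
# one for policy improvement and one for policy evaluation.
actor, q_improve, q_evaluate = Actor(), QNet(), QNet()
opt_a  = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_qi = torch.optim.Adam(q_improve.parameters(), lr=1e-3)
opt_qe = torch.optim.Adam(q_evaluate.parameters(), lr=1e-3)

gamma = 0.95  # assumed discount factor

def aiql_step(e, u, cost, e_next):
    """One sketched iteration on a batch of LNES transitions (e, u, cost, e_next).

    Q here is a cost-to-go, so the actor descends it. The two critic networks
    play the asynchronous roles the abstract assigns them: q_evaluate tracks
    the value of the current policy, while q_improve is used to update the actor.
    """
    with torch.no_grad():
        target = cost + gamma * q_evaluate(e_next, actor(e_next))

    # Policy evaluation: fit q_evaluate to the Bellman target of the current policy.
    loss_qe = ((q_evaluate(e, u) - target) ** 2).mean()
    opt_qe.zero_grad(); loss_qe.backward(); opt_qe.step()

    # Policy improvement: fit q_improve to the same target, then update the
    # actor by minimizing the predicted cost-to-go of its own actions.
    loss_qi = ((q_improve(e, u) - target) ** 2).mean()
    opt_qi.zero_grad(); loss_qi.backward(); opt_qi.step()

    loss_a = q_improve(e, actor(e)).mean()
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()
```

In this sketch the split between the two critics mirrors the abstract's description: updating the evaluation network and the improvement network in separate steps is what distinguishes the asynchronous scheme from plain IQL, where a single Q-value would serve both roles.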

