College of Electronics and Information Engineering, Southwest University, Chongqing, 400715, PR China.
Neural Netw. 2024 Dec;180:106667. doi: 10.1016/j.neunet.2024.106667. Epub 2024 Aug 26.
This paper addresses the tracking control problem of nonlinear discrete-time multi-agent systems (MASs). First, a local neighborhood error system (LNES) is constructed. Then, a novel tracking algorithm based on asynchronous iterative Q-learning (AIQL) is developed, which transforms the tracking problem into the optimal regulation of the LNES. The AIQL-based algorithm maintains two Q-values for each agent i: one used to improve the control policy and one used to evaluate the value of the current control policy. Moreover, the convergence of the LNES is established: the LNES converges to zero, and the tracking problem is thereby solved. A neural-network-based actor-critic framework is used to implement AIQL, in which the critic consists of two neural networks, one approximating each of the two Q-values. Finally, simulation results verify the performance of the developed algorithm and show that the AIQL-based tracking algorithm achieves a lower cost value and faster convergence than the IQL-based tracking algorithm.
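To make the two-Q structure concrete, the following is a minimal single-agent sketch, not the paper's implementation: one critic (q_eval) is trained by policy evaluation on the neighborhood error, while a second critic (q_imp) is refreshed from q_eval only every K iterations and is the one used for policy improvement, so the two Q-values evolve on asynchronous schedules. The LNES dynamics in lnes_step, the quadratic stage cost, the network sizes, and the refresh period K are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical one-agent sketch of asynchronous two-critic Q-learning.
# The LNES dynamics, stage cost, dimensions, and refresh period K below
# are illustrative assumptions, not taken from the paper.

class Critic(nn.Module):
    """Q(e, u): maps a neighborhood error e and control u to a scalar value."""
    def __init__(self, err_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(err_dim + act_dim, 64), nn.Tanh(),
                                 nn.Linear(64, 1))

    def forward(self, e, u):
        return self.net(torch.cat([e, u], dim=-1))

class Actor(nn.Module):
    """u = pi(e): control policy as a function of the neighborhood error."""
    def __init__(self, err_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(err_dim, 64), nn.Tanh(),
                                 nn.Linear(64, act_dim))

    def forward(self, e):
        return self.net(e)

def lnes_step(e, u):
    """Placeholder LNES dynamics e_{k+1} = f(e_k, u_k) (assumed, not the paper's)."""
    return 0.9 * e + 0.1 * u.sum(dim=-1, keepdim=True)

err_dim, act_dim, gamma, K = 4, 2, 0.95, 10
q_eval = Critic(err_dim, act_dim)      # Q-value that evaluates the current policy
q_imp = Critic(err_dim, act_dim)       # Q-value used to improve the policy
q_imp.load_state_dict(q_eval.state_dict())
for p in q_imp.parameters():           # q_imp is only updated by periodic copying
    p.requires_grad_(False)
actor = Actor(err_dim, act_dim)
opt_q = torch.optim.Adam(q_eval.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-3)

for it in range(200):
    e = torch.randn(32, err_dim)       # batch of sampled neighborhood errors
    u = actor(e).detach()
    e_next = lnes_step(e, u)
    cost = e.pow(2).sum(-1, keepdim=True) + 0.1 * u.pow(2).sum(-1, keepdim=True)

    # Policy evaluation: fit q_eval to the one-step Bellman target.
    with torch.no_grad():
        target = cost + gamma * q_eval(e_next, actor(e_next))
    q_loss = (q_eval(e, u) - target).pow(2).mean()
    opt_q.zero_grad(); q_loss.backward(); opt_q.step()

    # Policy improvement: descend the frozen q_imp with respect to the actor.
    a_loss = q_imp(e, actor(e)).mean()
    opt_a.zero_grad(); a_loss.backward(); opt_a.step()

    # Asynchronous refresh: copy the evaluation critic into the improvement
    # critic only every K iterations, so the two Q-values update on
    # different schedules.
    if it % K == 0:
        q_imp.load_state_dict(q_eval.state_dict())
```

Holding the improvement critic fixed between refreshes keeps the actor's optimization target stable, analogous to target networks in deep Q-learning; whether this matches the paper's exact asynchronous schedule would need to be checked against its algorithm description.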