School of Automation Science and Electronic Engineering, Science and Technology on Aircraft Control Laboratory, Beihang University, Beijing, 100191, PR China.
School of Automation Science and Electronic Engineering, Science and Technology on Aircraft Control Laboratory, Beihang University, Beijing, 100191, PR China; Institute of Artificial Intelligence, Beihang University, Beijing, 100191, PR China.
ISA Trans. 2023 Jul;138:318-328. doi: 10.1016/j.isatra.2023.03.003. Epub 2023 Mar 8.
This paper studies the distributed time-varying output formation tracking problem for heterogeneous multi-agent systems whose agents differ in both dimensions and parameters. The output of each follower is required to track that of a virtual leader while achieving a time-varying formation configuration. First, a distributed trajectory generator based on neighboring interactions is proposed to reconstruct the state of the virtual leader and provide expected trajectories that incorporate the formation. Second, an optimal tracking controller is designed using a model-free reinforcement learning technique that learns from online off-policy data and requires no knowledge of the followers' dynamics. The stability of the learning process and of the resulting controller is analyzed, and solutions to the output regulator equations are equivalently obtained. Third, a compensating input is designed for each follower based on the preceding learning results and a derived feasibility condition. It is proved that the output formation tracking error converges to zero asymptotically, while the bias in the cost functions can be made arbitrarily small. Finally, numerical simulations verify the proposed learning and control scheme.
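The first step of the abstract, a distributed trajectory generator that reconstructs the virtual leader's state from neighboring interactions and adds the formation, can be illustrated with a small simulation. The sketch below is not the generator from the paper. It uses a standard consensus-type observer, and everything concrete in it is an assumption made for illustration: the harmonic-oscillator leader dynamics, the four-follower path graph in which only follower 0 measures the leader, the coupling gain `c`, the rotating circular formation offsets, the forward-Euler integration, and the function name `simulate_trajectory_generator`.

```python
import math


def simulate_trajectory_generator(T=20.0, dt=1e-3, c=5.0, radius=1.0, omega=0.5):
    """Consensus-type leader-state observer plus formation offsets (illustrative only)."""

    def s_mul(v):
        # Assumed leader dynamics dx0/dt = S x0 with S = [[0, 1], [-1, 0]].
        return (v[1], -v[0])

    # Assumed undirected path graph 0-1-2-3 among the followers.
    neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    # Assumed pinning gains g_i: only follower 0 has access to the leader state.
    pinned = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}
    n = len(neighbors)

    x0 = (1.0, 0.0)  # leader state
    # Arbitrary initial estimates xi_i of the leader state, one per follower.
    xi = [(0.5 * i - 1.0, 1.0 - 0.7 * i) for i in range(n)]

    steps = int(round(T / dt))
    for _ in range(steps):
        new_xi = []
        for i in range(n):
            # Coupling term: sum_j a_ij (xi_j - xi_i) + g_i (x0 - xi_i).
            cx = sum(xi[j][0] - xi[i][0] for j in neighbors[i])
            cy = sum(xi[j][1] - xi[i][1] for j in neighbors[i])
            cx += pinned[i] * (x0[0] - xi[i][0])
            cy += pinned[i] * (x0[1] - xi[i][1])
            sx, sy = s_mul(xi[i])
            # Forward-Euler step of d(xi_i)/dt = S xi_i + c * coupling.
            new_xi.append((xi[i][0] + dt * (sx + c * cx),
                           xi[i][1] + dt * (sy + c * cy)))
        sx0, sy0 = s_mul(x0)
        x0 = (x0[0] + dt * sx0, x0[1] + dt * sy0)
        xi = new_xi

    # Assumed time-varying formation h_i(t): points rotating on a circle.
    t = steps * dt
    offsets = [(radius * math.cos(omega * t + 2 * math.pi * i / n),
                radius * math.sin(omega * t + 2 * math.pi * i / n))
               for i in range(n)]
    # Expected trajectory of follower i: leader estimate plus formation offset.
    refs = [(xi[i][0] + offsets[i][0], xi[i][1] + offsets[i][1]) for i in range(n)]
    est_err = [math.hypot(xi[i][0] - x0[0], xi[i][1] - x0[1]) for i in range(n)]
    return x0, refs, offsets, est_err


if __name__ == "__main__":
    leader, refs, offsets, est_err = simulate_trajectory_generator()
    print("max leader-state estimation error:", max(est_err))
```

Under these assumptions the estimation errors should decay toward zero, because the followers' graph is connected and at least one follower is pinned to the leader. Each follower's expected trajectory then approaches the leader state shifted by that follower's formation offset, which is the reference an output tracking controller would be asked to follow.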