Li Ruihong, Gan Qintao, Ren Guoquan, Wu Huaiqin, Cao Jinde
IEEE Trans Cybern. 2025 Sep;55(9):4182-4195. doi: 10.1109/TCYB.2025.3583368.
This article addresses the fixed-time optimal leader-following consensus problem for unknown multiagent systems (MASs) under denial-of-service (DoS) and false data injection (FDI) attacks. A novel fixed-time stability theorem under DoS attacks is presented that simplifies the stability conditions and reduces the computational complexity of the settling-time estimate. In addition, a deep neural network (DNN) structure with a projection operator is adopted to approximate the unknown system dynamics in real time. To achieve optimal consensus under cyber-attacks, a hierarchical control approach is presented, consisting of a reference signal generation layer and a tracking control layer. Specifically, distributed and Luenberger-based observers are designed in the reference signal generation layer to solve the fixed-time state estimation problems of the leader and the followers, respectively, under multiple malicious attacks. Then, an optimal control strategy based on an event-triggered mechanism (ETM) is designed in the tracking control layer to track the reference signal and minimize the cost. Because explicit expressions of the optimal control policies are difficult to obtain, a critic-only reinforcement learning (RL)-based algorithm is presented to learn the unknown weights online within a fixed time. It is rigorously proved that the developed observers achieve fixed-time state reconstruction and that the optimal control policy tracks the observed states after a fixed time. Finally, simulation results on platooning control of automated vehicles are given to demonstrate the efficacy of the developed technique.
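For readers unfamiliar with the term, the sketch below illustrates what a fixed-time settling bound means. It uses the classic attack-free lemma commonly attributed to Polyakov, not the new DoS-aware theorem of the article, whose conditions are not given in the abstract: if a Lyapunov function satisfies dV/dt <= -alpha*V^p - beta*V^q with alpha, beta > 0 and 0 < p < 1 < q, then V reaches zero no later than T = 1/(alpha*(1-p)) + 1/(beta*(q-1)), regardless of V(0). The function names, parameter values, and the forward-Euler check are assumptions chosen for illustration only.

```python
def settling_time_bound(alpha, beta, p, q):
    """Classic attack-free fixed-time bound (an assumption here, not the
    article's theorem): T <= 1/(alpha*(1-p)) + 1/(beta*(q-1))."""
    if not (alpha > 0 and beta > 0 and 0 < p < 1 < q):
        raise ValueError("need alpha, beta > 0 and 0 < p < 1 < q")
    return 1.0 / (alpha * (1.0 - p)) + 1.0 / (beta * (q - 1.0))


def simulate_settling_time(v0, alpha, beta, p, q,
                           dt=1e-4, tol=1e-6, t_max=100.0):
    """Forward-Euler integration of the worst case dV/dt = -alpha*V**p - beta*V**q.

    Returns the first time at which V <= tol (a numerical stand-in for V = 0).
    """
    v, t = float(v0), 0.0
    while v > tol and t < t_max:
        # Clamp at zero so the discrete step cannot make V negative.
        v = max(v - dt * (alpha * v ** p + beta * v ** q), 0.0)
        t += dt
    return t


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to make the bound easy to read.
    alpha, beta, p, q = 1.0, 1.0, 0.5, 2.0
    bound = settling_time_bound(alpha, beta, p, q)
    print(f"bound independent of V(0): {bound:.3f}")
    for v0 in (0.1, 1.0, 10.0, 100.0):
        t = simulate_settling_time(v0, alpha, beta, p, q)
        print(f"V(0) = {v0:6.1f}  ->  simulated settling time {t:.3f}")
```

The simulated settling time grows with V(0) but stays below the same bound, which is the property that distinguishes fixed-time from finite-time convergence, where the bound depends on the initial condition.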