Yan Lei, Liu Junhe, Lai Guanyu, Philip Chen C L, Wu Zongze, Liu Zhi
IEEE Trans Neural Netw Learn Syst. 2025 Mar;36(3):5417-5427. doi: 10.1109/TNNLS.2024.3379503. Epub 2025 Feb 28.
Developing a distributed bipartite optimal consensus scheme that also ensures user-predefined performance is essential in practical applications. Existing approaches to this problem typically require a complex controller structure because they adopt an identifier-actor-critic framework, and they cannot guarantee prescribed performance. In this work, an adaptive critic learning (ACL)-based optimal bipartite consensus scheme is developed to bridge this gap. A newly designed error scaling function, which specifies the user-predefined settling time and steady-state accuracy without relying on the initial conditions, is integrated into the cost function. Within a backstepping framework, ACL is combined with the integral reinforcement learning (IRL) algorithm to develop the adaptive optimal bipartite consensus scheme, which yields a critic-only controller structure by removing the identifier and actor networks used in existing methods. The adaptive law of the critic network is derived using gradient descent and experience replay to minimize the IRL-based residual error. It is shown that this computation-saving learning mechanism achieves optimal consensus and that the error variables of the closed-loop system are uniformly ultimately bounded (UUB). Moreover, for any bounded initial condition, the evolution of the bipartite consensus error remains within a user-prescribed boundary. Illustrative simulation results validate the efficacy of the approach.
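To make the critic-only idea concrete, the following is a minimal sketch, not the paper's adaptive law. It assumes a hypothetical scalar plant x' = A·x + B·u with running cost Q·x² + R·u², a single quadratic basis function, and a commonly used normalized gradient step; the constants, the function names (`phi`, `collect_sample`, `critic_step`, `learn`), and the normalization are all illustrative assumptions. It omits the multi-agent signed graph, the backstepping design, and the error scaling function described in the abstract. What it does show is the core mechanism: the IRL residual is built from measured trajectory data, stored samples are replayed in every gradient-descent step, and the control is read directly from the critic weights with no identifier or actor network.

```python
import numpy as np

# Hypothetical scalar plant x' = A*x + B*u with running cost Q*x^2 + R*u^2.
# These constants are illustrative assumptions, not values from the paper.
A, B, Q, R = -1.0, 1.0, 1.0, 1.0


def phi(x):
    """Critic basis; the value function is approximated by w @ phi(x)."""
    return np.array([x * x])


def collect_sample(x0, gain, horizon=0.5, steps=1000):
    """Roll out u = -gain*x over one IRL interval using forward Euler.

    Returns (phi(x(t+T)) - phi(x(t)), integral of the running cost).
    Only measured data is returned; the learner never sees A.
    """
    dt = horizon / steps
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -gain * x
        cost += (Q * x * x + R * u * u) * dt
        x += (A * x + B * u) * dt
    return phi(x) - phi(x0), cost


def critic_step(w, buffer, lr=1.0):
    """One normalized gradient-descent step over all replayed samples."""
    grad = np.zeros_like(w)
    for dphi, cost in buffer:
        residual = w @ dphi + cost  # IRL Bellman residual for this sample
        grad += residual * dphi / (1.0 + dphi @ dphi) ** 2
    return w - lr * grad / len(buffer)


def learn(policy_updates=6, critic_steps=500):
    """Alternate replay-based critic learning with critic-derived policies."""
    w = np.zeros(1)
    gain = 0.0  # u = 0 is admissible here only because A < 0
    for _ in range(policy_updates):
        # Refill the replay buffer with data generated by the current policy.
        buffer = [collect_sample(x0, gain) for x0 in np.linspace(0.5, 2.0, 20)]
        for _ in range(critic_steps):
            w = critic_step(w, buffer)
        # Critic-only control: u = -(1/2) R^{-1} B dV/dx = -(B*w/R) * x.
        gain = B * w[0] / R
    return w, gain
```

Calling `learn()` returns the learned critic weight and the feedback gain derived from it. For the assumed constants, the weight should settle close to √2 − 1 ≈ 0.414, the solution of the scalar Riccati equation, with a small bias from the Euler integration. The buffer is refilled after each policy update because replayed samples are only valid for the policy that generated them in this simplified policy-iteration form.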