IEEE Trans Cybern. 2013 Feb;43(1):206-16. doi: 10.1109/TSMCB.2012.2203336. Epub 2012 Jun 28.
This paper proposes a near-optimal control scheme for nonzero-sum differential games of continuous-time nonlinear systems. Single-network adaptive dynamic programming (ADP) is used to obtain the optimal control policies, under which the players' cost functions reach the Nash equilibrium of the game. Each player uses only one critic network, instead of the action-critic dual network of a typical ADP architecture. Novel weight-tuning laws are proposed for the critic neural networks; they ensure both that the Nash equilibrium is reached and that the system remains stable. No initial stabilizing control policy is required for any player. Lyapunov theory is used to demonstrate the uniform ultimate boundedness of the closed-loop system. Finally, a simulation example verifies the effectiveness of the proposed near-optimal control scheme.
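The single-network idea, where each player's policy is read directly off its critic rather than produced by a separate action network, can be illustrated with a small sketch. This is not the paper's implementation. It assumes dynamics that are affine in each player's control, a cost integrand that is quadratic in the player's own scalar control with weight `r`, and a hypothetical quadratic basis `phi(x) = [x1^2, x1*x2, x2^2]` for a two-dimensional state. Under those assumptions, the stationarity condition of a player's Hamiltonian gives the control in closed form from the critic gradient. The function names and the basis are illustrative choices, and the paper's weight-tuning laws are not reproduced here.

```python
# Illustrative sketch only: the basis, names, and scalar-control setup are assumptions.
# The critic for one player approximates V(x) by W^T phi(x),
# with the hypothetical basis phi(x) = [x1^2, x1*x2, x2^2].

def grad_phi(x):
    """Jacobian of phi(x) = [x1^2, x1*x2, x2^2]; one row per basis function."""
    x1, x2 = x
    return [
        [2.0 * x1, 0.0],
        [x2, x1],
        [0.0, 2.0 * x2],
    ]


def value_gradient(weights, x):
    """Gradient with respect to x of the critic output W^T phi(x)."""
    rows = grad_phi(x)
    return [sum(w * row[j] for w, row in zip(weights, rows)) for j in range(2)]


def policy_from_critic(weights, x, g, r):
    """Control u = -(1 / (2 r)) * g(x)^T grad V(x) for a scalar control, r > 0.

    `g` maps the state to the player's input vector field (a list of length 2).
    No action network is involved: the policy is a function of the critic weights.
    """
    dv = value_gradient(weights, x)
    gx = g(x)
    return -0.5 / r * sum(gi * di for gi, di in zip(gx, dv))


if __name__ == "__main__":
    # With W = [1, 0, 1] the critic is V(x) = x1^2 + x2^2.
    u = policy_from_critic([1.0, 0.0, 1.0], (1.0, 2.0), lambda x: [0.0, 1.0], 1.0)
    print(u)
```

In a two-player game, each player would call `policy_from_critic` with its own critic weights, input vector field, and control weight, so the number of networks equals the number of players.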