School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.
Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI 02881, USA.
Neural Netw. 2018 Mar;99:19-30. doi: 10.1016/j.neunet.2017.11.022. Epub 2017 Dec 13.
This paper presents a novel adaptive dynamic programming (ADP)-based self-learning robust optimal control scheme for input-affine continuous-time nonlinear systems with mismatched disturbances. First, a stabilizing feedback controller for the original nonlinear system is designed by modifying the optimal control law of an auxiliary system. It is also demonstrated that this feedback controller optimizes a specified value function. Then, within the framework of ADP, a single critic network is constructed to solve the Hamilton-Jacobi-Bellman equation associated with the optimal control law of the auxiliary system. To update the critic network weights, an indicator function and a concurrent learning technique are employed. With the proposed update law for the critic network, two restrictive conditions are relaxed: the need for an initial admissible control and the persistence of excitation condition. Moreover, the stability of the closed-loop auxiliary system is guaranteed in the sense that all signals are uniformly ultimately bounded. Finally, the applicability of the developed control strategy is illustrated through simulations of an unstable nonlinear plant and a power system.
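To make the role of the indicator function and the recorded data more concrete, the following is a minimal sketch on a toy problem, not the update law from the paper. It assumes a hypothetical scalar linear plant x' = a*x + b*u without disturbances, a one-weight critic V(x) ~ w*x^2, a fixed stack of recorded states standing in for concurrent learning, and an indicator built from the Lyapunov candidate J_s(x) = x^2/2. All constants, gains, and the name `train_critic` are illustrative assumptions.

```python
import math

# Hypothetical scalar plant x' = A*x + B*u with cost integrand Q*x^2 + R*u^2.
A, B, Q, R = 1.0, 1.0, 1.0, 1.0   # A > 0, so the open loop is unstable

# Recorded states reused at every step (a stand-in for concurrent learning).
STACK = [0.5, 1.0, 1.5, 2.0]


def train_critic(w0=0.0, alpha=10.0, alpha_s=10.0, dt=0.01, steps=5000):
    """Euler-integrate a critic update for V(x) ~ w*x^2 and return the final w."""
    w = w0
    n = len(STACK)
    for _ in range(steps):
        gain = A - B * B * w / R      # closed-loop pole under u = -(B*w/R)*x
        grad = 0.0
        push = 0.0
        for x in STACK:
            u = -B * w * x / R
            # Regressor: derivative of the residual w.r.t. w with u held fixed.
            sigma = 2.0 * x * (A * x + B * u)
            # Hamilton-Jacobi-Bellman residual at the recorded state.
            resid = w * sigma + Q * x * x + R * u * u
            grad += sigma * resid / (1.0 + sigma * sigma) ** 2
            # Indicator is 1 while J_s(x) = x^2/2 is not decreasing in closed loop.
            indicator = 1.0 if x * (gain * x) >= 0.0 else 0.0
            push += indicator * B * B * x * x / R
        # Normalized gradient step on the residual plus the stabilizing push.
        w += dt * (-alpha * grad / n + alpha_s * push / n)
    return w


if __name__ == "__main__":
    w_star = R * (A + math.sqrt(A * A + B * B * Q / R)) / (B * B)  # Riccati root
    print("with indicator:   ", train_critic(), "target", w_star)
    print("without indicator:", train_critic(alpha_s=0.0))
```

In this toy setting the weight starts at zero, which corresponds to a non-stabilizing control. With the indicator term active, the weight is pushed until the closed loop is stable and then settles near the stabilizing Riccati solution. With `alpha_s=0.0`, the same gradient step drifts to the non-stabilizing root instead, which is the failure mode an initial admissible control is normally required to avoid.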