He Shuping, Fang Haiyang, Zhang Maoguang, Liu Fei, Ding Zhengtao
IEEE Trans Neural Netw Learn Syst. 2020 Feb;31(2):549-558. doi: 10.1109/TNNLS.2019.2905715. Epub 2019 Apr 11.
This paper studies online adaptive optimal controller design for a class of nonlinear systems using a novel policy iteration (PI) algorithm. A neural network linear differential inclusion (LDI) technique linearizes the nonlinear terms in each iteration, so the optimal control law can be obtained from the associated algebraic Riccati equation (ARE) without using the system's internal parameters. Based on the PI approach, the adaptive optimal control algorithm combines online linearization with a two-step iteration: policy evaluation and policy improvement. The convergence of the proposed PI algorithm is also proved. Finally, two numerical examples illustrate the effectiveness and applicability of the proposed method.
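To make the two-step iteration concrete, the following is a minimal sketch of the classical model-based policy iteration for a fixed linear model (often called Kleinman's iteration), which converges to the stabilizing ARE solution. It is not the paper's algorithm: it assumes the matrices `a` and `b` are known, whereas the abstract states that the proposed method avoids the system internal parameters and re-linearizes the nonlinear terms with a neural network LDI in each iteration. The function names, the double-integrator example, and the initial gain `k0` are illustrative assumptions.

```python
import numpy as np


def solve_lyapunov(a_cl, m):
    """Solve a_cl.T @ P + P @ a_cl = -m for P using Kronecker vectorization."""
    n = a_cl.shape[0]
    eye = np.eye(n)
    # Linear operator acting on the flattened P; valid for row- or column-major vec.
    lhs = np.kron(eye, a_cl.T) + np.kron(a_cl.T, eye)
    p = np.linalg.solve(lhs, -m.reshape(-1)).reshape(n, n)
    return 0.5 * (p + p.T)  # symmetrize against round-off


def policy_iteration(a, b, q, r, k0, tol=1e-12, max_iter=50):
    """Model-based PI for the LQR problem; k0 must stabilize a - b @ k0."""
    k = k0
    p_prev = None
    for _ in range(max_iter):
        a_cl = a - b @ k
        # Policy evaluation: cost matrix of the current gain k.
        p = solve_lyapunov(a_cl, q + k.T @ r @ k)
        # Policy improvement: greedy gain with respect to p.
        k = np.linalg.solve(r, b.T @ p)
        if p_prev is not None and np.linalg.norm(p - p_prev) < tol:
            break
        p_prev = p
    return p, k


# Hypothetical example: double integrator with identity state weight.
a = np.array([[0.0, 1.0], [0.0, 0.0]])
b = np.array([[0.0], [1.0]])
q = np.eye(2)
r = np.array([[1.0]])
k0 = np.array([[1.0, 1.0]])  # assumed initial stabilizing gain

p_opt, k_opt = policy_iteration(a, b, q, r, k0)

# Residual of the ARE: A'P + PA - P B R^{-1} B' P + Q, near zero at convergence.
are_residual = (a.T @ p_opt + p_opt @ a
                - p_opt @ b @ np.linalg.solve(r, b.T @ p_opt) + q)
```

For this example the ARE can be solved by hand, giving P = [[√3, 1], [1, √3]] and K = [1, √3], which provides a check on the iteration. The requirement of an initial stabilizing gain is a property of this classical scheme; the abstract does not say how the paper initializes its own iteration.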