IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2614-2624. doi: 10.1109/TNNLS.2017.2761718.
This paper proposes a novel data-driven control approach to the adaptive optimal tracking problem for a class of nonlinear systems in strict-feedback form. Adaptive dynamic programming (ADP) and nonlinear output regulation theory are integrated for the first time to compute an adaptive near-optimal tracker without any a priori knowledge of the system dynamics. Unlike in adaptive optimal stabilization problems, the solution to the Hamilton-Jacobi-Bellman (HJB) equation here is not necessarily a positive definite function, so it cannot be approximated by existing iterative methods. The paper therefore proposes a novel policy iteration technique, with rigorous convergence analysis, for solving HJB equations whose solution may be only positive semidefinite. A two-phase data-driven learning method is developed and implemented online by ADP. The efficacy of the proposed adaptive optimal tracking control methodology is demonstrated on a Van der Pol oscillator with time-varying exogenous signals.
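For readers unfamiliar with policy iteration, the following is a minimal background sketch of the classical, model-based version (Kleinman's algorithm) for a scalar linear system. It is not the paper's method: the paper's technique is data-driven, targets nonlinear strict-feedback systems, and handles value functions that are only positive semidefinite. The system x' = a*x + b*u, the cost integrand q*x^2 + r*u^2, and the function names `policy_iteration_scalar` and `riccati_scalar` are all assumptions made for illustration.

```python
import math


def policy_iteration_scalar(a, b, q, r, k0, iters=20, tol=1e-12):
    """Classical policy iteration for x' = a*x + b*u, cost = integral of q*x^2 + r*u^2.

    Illustrative assumption only: model-based, scalar, positive definite case.
    k0 must be a stabilizing gain for the policy u = -k0*x (a - b*k0 < 0).
    Returns (p, k), where V(x) = p*x^2 is the value estimate and u = -k*x.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain k0 must stabilize the closed loop")
    k = k0
    p_prev = None
    p = None
    for _ in range(iters):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: minimize the Hamiltonian over u, giving k = b*p/r.
        k = b * p / r
        if p_prev is not None and abs(p - p_prev) < tol:
            break
        p_prev = p
    return p, k


def riccati_scalar(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)^2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k = policy_iteration_scalar(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
    print(p, k, riccati_scalar(1.0, 1.0, 1.0, 1.0))
```

Starting from any stabilizing gain, the iterates in this sketch converge to the Riccati solution. The evaluation step assumes the model parameters a and b are known, which is exactly the requirement the paper's data-driven approach is stated to remove.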