Wen Guoxing, Ge Shuzhi Sam, Tu Fangwen
IEEE Trans Neural Netw Learn Syst. 2018 Aug;29(8):3850-3862. doi: 10.1109/TNNLS.2018.2803726. Epub 2018 Mar 6.
In this paper, a control technique named optimized backstepping is proposed for the first time for the tracking control of a class of strict-feedback systems; it treats optimization as a design philosophy for high-order system control. The basic idea is to design the actual and virtual controls of backstepping as the optimized solutions of their corresponding subsystems, so that the overall control of the high-order system is optimized. In general, optimized control is designed from the solution of the Hamilton-Jacobi-Bellman equation, but solving this equation is very difficult because of its inherent nonlinearity and intractability. To overcome this difficulty, a neural network (NN)-based reinforcement learning strategy with an actor-critic architecture is used. In every backstepping step, an actor NN and a critic NN are constructed to execute the control behavior and to evaluate the control performance, respectively. Using the Lyapunov stability theorem, it is proven that the desired control performance can be obtained. Finally, a simulation example is presented to further demonstrate the effectiveness of the proposed control approach.
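The actor-critic idea for a single backstepping step can be illustrated with a toy sketch. The abstract does not give the system model, cost function, NN structure, or update laws, so everything below is an assumption chosen for illustration: a scalar linear subsystem dz/dt = a*z + u, a quadratic cost z^2 + u^2, a one-weight critic J(z) = w_critic * z^2, a one-weight actor u = -w_actor * z, and simple gradient-style updates. It is not the paper's algorithm. For this toy problem the Hamilton-Jacobi-Bellman equation reduces to a scalar Riccati equation whose stabilizing solution is a + sqrt(a^2 + 1), which gives a value to compare the learned weights against.

```python
import math


def train_actor_critic(a=1.0, steps=5000, lr_critic=0.05, lr_actor=0.01):
    """Toy actor-critic for dz/dt = a*z + u with cost integrand z^2 + u^2.

    All names and update rules here are hypothetical illustrations,
    not the update laws of the paper.
    """
    # Fixed training states (assumed); a real scheme would use the trajectory.
    z_samples = [-1.0, -0.5, 0.5, 1.0]
    # Start from a stabilizing policy (w_actor > a), as policy iteration requires.
    w_critic = 3.0
    w_actor = 3.0

    for _ in range(steps):
        grad_sum = 0.0
        for z in z_samples:
            u = -w_actor * z                 # actor: executes the control
            z_dot = a * z + u                # subsystem dynamics under u
            dJ_dz = 2.0 * w_critic * z       # gradient of the critic's value estimate
            # HJB (Bellman) residual: zero when the critic evaluates the actor exactly.
            residual = z**2 + u**2 + dJ_dz * z_dot
            # Derivative of the residual with respect to the critic weight.
            d_residual_dw = 2.0 * z * z_dot
            grad_sum += residual * d_residual_dw
        # Critic: gradient descent on the mean squared residual (policy evaluation).
        w_critic -= lr_critic * grad_sum / len(z_samples)
        # Actor: move toward the critic-implied control u = -(1/2) dJ/dz = -w_critic * z.
        w_actor += lr_actor * (w_critic - w_actor)

    return w_critic, w_actor


if __name__ == "__main__":
    w_c, w_a = train_actor_critic()
    print("critic weight:", w_c)
    print("actor weight: ", w_a)
    print("Riccati value:", 1.0 + math.sqrt(2.0))
```

In this sketch the critic is updated faster than the actor, so the critic approximately evaluates the current policy before the actor improves on it. In the backstepping setting described in the abstract, one such actor-critic pair would be built per step, with the virtual control of each subsystem playing the role of u here.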