IEEE Trans Neural Netw Learn Syst. 2015 Aug;26(8):1776-88. doi: 10.1109/TNNLS.2015.2409301. Epub 2015 Mar 18.
This paper addresses the output-feedback-based near-optimal regulation of uncertain, quantized nonlinear discrete-time systems in affine form with control constraints over a finite horizon. First, the effect of the input constraints is handled using a nonquadratic cost functional. Next, a neural network (NN)-based Luenberger observer is proposed to reconstruct both the system states and the control coefficient matrix, so that a separate identifier is not needed. Then, an approximate dynamic programming (ADP)-based actor-critic framework is utilized to approximate the time-varying solution of the Hamilton-Jacobi-Bellman (HJB) equation using NNs with constant weights and time-dependent activation functions. A new error term is defined and incorporated into the NN update law so that the terminal constraint error is also minimized over time. Finally, a novel dynamic quantizer for the control inputs with an adaptive step size is designed to eliminate the quantization error over time, thus overcoming the drawback of the traditional uniform quantizer. The proposed scheme functions in a forward-in-time manner without an offline training phase. Lyapunov analysis is used to investigate stability. Simulation results are given to show the effectiveness and feasibility of the proposed method.
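The abstract does not reproduce the nonquadratic cost functional or the NN parameterization; as a hedged sketch, the standard choices in constrained ADP take the forms below, where the bound λ, the weighting matrix R, and the activation vector φ are illustrative assumptions rather than the paper's exact expressions:

```latex
% Illustrative forms only -- standard in constrained ADP, not quoted
% from the paper.  W(u) keeps each control component within |u_i| <= lambda;
% the value function uses constant weights \widehat{W} with time-dependent
% activations \phi, matching a finite-horizon (time-varying) HJB solution.
W(u) = 2\int_{0}^{u} \bigl(\lambda \tanh^{-1}(v/\lambda)\bigr)^{\top} R \,\mathrm{d}v,
\qquad
V^{*}(x_k, k) \approx \widehat{W}^{\top} \phi(x_k, k).
```

The tanh-type integrand is what makes the functional nonquadratic: minimizing it yields a control law that saturates smoothly at ±λ, so the input constraint is respected without clipping.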
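As a rough illustration of the dynamic-quantizer idea, the sketch below implements a uniform quantizer whose step size contracts once the control signal fits well inside the quantizer range, so the per-component quantization error (bounded by step/2) decays instead of staying fixed as in a traditional uniform quantizer. The class name, the contraction rule, and the parameter `gamma` are hypothetical stand-ins, not the paper's design:

```python
import numpy as np

class DynamicUniformQuantizer:
    """Uniform quantizer with an adaptive step size.

    Illustrative sketch only: the contraction rule below is a
    hypothetical stand-in for the paper's adaptive-step design.
    """

    def __init__(self, step=1.0, levels=64, gamma=0.9):
        self.step = step      # current quantization step size
        self.levels = levels  # number of quantization levels
        self.gamma = gamma    # contraction factor, 0 < gamma < 1

    def quantize(self, u):
        # Saturate to the quantizer range, then round to the grid;
        # the error per component is at most step / 2.
        half_range = self.step * self.levels / 2
        u_sat = np.clip(u, -half_range, half_range)
        return self.step * np.round(u_sat / self.step)

    def update(self, u):
        # Shrink the step whenever the signal sits well inside the
        # range, so the quantization error vanishes over time.
        if np.max(np.abs(u)) < self.gamma * self.step * self.levels / 2:
            self.step *= self.gamma


# Usage: quantize a decaying (regulated) control sequence.
q = DynamicUniformQuantizer(step=0.5, levels=64)
for k in range(50):
    u = np.array([2.0 * 0.9**k])  # hypothetical control input
    u_q = q.quantize(u)
    q.update(u)
```

Because the step size only contracts when the signal is safely inside the range, the quantizer never saturates a shrinking control, which is the intuition behind eliminating the quantization error as the regulated system settles.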