Yan Lei, Liu Zhi, Chen C L Philip, Zhang Yun, Wu Zongze
School of Automation, Guangdong University of Technology, Guangzhou, Guangdong, 510006, China; School of Intelligent Manufacturing, Nanyang Institute of Technology, Nanyang, Henan, 473004, China.
School of Automation, Guangdong University of Technology, Guangzhou, Guangdong, 510006, China.
ISA Trans. 2023 Feb;133:29-41. doi: 10.1016/j.isatra.2022.07.006. Epub 2022 Jul 12.
Existing control schemes for state-constrained systems either impose feasibility conditions or ignore optimality. This article proposes an adaptive optimal control scheme for strict-feedback nonlinear systems, built in two design steps. First, a novel nonlinear state-dependent function (NSDF) is formulated that equivalently transforms the constrained system into an unconstrained one, so that state constraints are handled without requiring feasibility conditions. Second, an adaptive optimal control scheme is designed for the unconstrained system, in which reinforcement learning (RL) is used to obtain the optimal controller at each design step. Actor and critic neural networks, updated by modified adaptive laws, approximate the optimal virtual and actual controllers. It is proved that all signals in the closed-loop system are bounded and that the output tracking error converges to an adjustable neighborhood of the origin that is not affected by the proposed NSDF. Two simulation examples illustrate the effectiveness of the proposed scheme.
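The abstract does not state the form of the NSDF, so the following is only a minimal sketch of the general idea of mapping a constrained state to an unconstrained one. It assumes a simple illustrative map s = x / ((k_a + x)(k_b - x)) on the open interval (-k_a, k_b). The function names, the bounds k_a and k_b, and the map itself are hypothetical and are not taken from the paper.

```python
import math


def to_unconstrained(x, k_a, k_b):
    """Map a state x in (-k_a, k_b) to an unconstrained variable s.

    Assumed illustrative map (not the paper's NSDF):
        s = x / ((k_a + x) * (k_b - x))
    s is 0 at x = 0 and grows without bound as x approaches either limit.
    """
    if not (-k_a < x < k_b):
        raise ValueError("x must lie strictly inside (-k_a, k_b)")
    return x / ((k_a + x) * (k_b - x))


def to_constrained(s, k_a, k_b):
    """Invert the assumed map: recover x in (-k_a, k_b) from any real s.

    Solving s * (k_a + x) * (k_b - x) = x for x gives the quadratic
        s*x^2 + (1 - s*(k_b - k_a))*x - s*k_a*k_b = 0,
    whose root with the same sign as s lies inside the interval.
    The form below avoids a division by s, so s = 0 needs no special case.
    """
    b = 1.0 - s * (k_b - k_a)
    disc = b * b + 4.0 * s * s * k_a * k_b
    return 2.0 * s * k_a * k_b / (b + math.sqrt(disc))


if __name__ == "__main__":
    k_a, k_b = 1.0, 2.0  # hypothetical constraint bounds
    for x in (-0.9, 0.0, 1.0, 1.9):
        s = to_unconstrained(x, k_a, k_b)
        print(x, s, to_constrained(s, k_a, k_b))
```

Under this assumed map, any finite value of s corresponds to a state strictly inside (-k_a, k_b), so a controller that keeps s bounded also keeps x within its constraints. That is the sense in which a transformation of this kind can replace an explicit feasibility check.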