Qin Chunbin, Jiang Kaijun, Zhang Jishi, Zhu Tianzeng
School of Artificial Intelligence, Henan University, Zhengzhou 450000, China.
School of Software, Henan University, Kaifeng 475000, China.
Entropy (Basel). 2023 Jul 24;25(7):1101. doi: 10.3390/e25071101.
This paper investigates a safe optimal control method, based on adaptive dynamic programming (ADP), for continuous-time (CT) nonlinear safety-critical systems with asymmetric input constraints and unmatched disturbances. First, a new non-quadratic function is introduced to handle the asymmetric input constraints. The safe optimal control problem is then transformed into a two-player zero-sum game (ZSG) to suppress the influence of the unmatched disturbances, and a new Hamilton-Jacobi-Isaacs (HJI) equation is derived by integrating a control barrier function (CBF) into the cost function to penalize unsafe behavior. A damping factor is embedded in the CBF to balance safety and optimality. To obtain the safe optimal controller, a single critic neural network (CNN) is used to solve the HJI equation, which reduces the computational load compared with the conventional actor-critic structure. The system state and the CNN parameters are then shown to be uniformly ultimately bounded (UUB) using the Lyapunov stability method. Finally, two examples are presented to confirm the effectiveness of the proposed approach.
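The abstract does not state the non-quadratic function itself, so the following is only a minimal sketch of one common way such a penalty is built in the constrained-ADP literature, and it may differ from the function the authors propose. It assumes a scalar input bounded in [u_min, u_max], recenters the input at the midpoint of the bounds, and applies the familiar inverse-hyperbolic-tangent integrand. The names `input_penalty` and `saturated_control`, the weight `r`, and the argument `grad_term` (a stand-in for the product of the input gain and the value-function gradient) are all hypothetical.

```python
import math


def _center_and_half_width(u_min, u_max):
    """Midpoint and half-width of the admissible interval [u_min, u_max]."""
    if u_max <= u_min:
        raise ValueError("u_max must be greater than u_min")
    return 0.5 * (u_max + u_min), 0.5 * (u_max - u_min)


def input_penalty(u, u_min, u_max, r=1.0):
    """Assumed non-quadratic penalty for an asymmetric bound on a scalar input.

    Closed form of 2 * r * lam * integral_0^(u - c) atanh(v / lam) dv,
    where c is the midpoint and lam the half-width of the interval.
    The penalty is zero at u = c and grows without bound near the limits.
    """
    c, lam = _center_and_half_width(u_min, u_max)
    s = (u - c) / lam  # normalized input, must lie strictly inside (-1, 1)
    if not -1.0 < s < 1.0:
        raise ValueError("u must lie strictly inside (u_min, u_max)")
    return 2.0 * r * lam**2 * s * math.atanh(s) + r * lam**2 * math.log(1.0 - s * s)


def saturated_control(grad_term, u_min, u_max, r=1.0):
    """Control that minimizes grad_term * u + input_penalty(u) over u.

    Setting the derivative to zero gives
    u = c - lam * tanh(grad_term / (2 * r * lam)),
    which stays inside [u_min, u_max] by construction.
    """
    c, lam = _center_and_half_width(u_min, u_max)
    return c - lam * math.tanh(grad_term / (2.0 * r * lam))


if __name__ == "__main__":
    # Hypothetical bounds chosen only for illustration.
    u_min, u_max = -1.0, 3.0
    for g in (-10.0, 0.0, 10.0):
        u = saturated_control(g, u_min, u_max)
        print(f"grad_term={g:+.1f} -> u={u:+.4f}")
```

Under these assumptions the control never leaves the admissible interval, whatever the size of the gradient term, which is the property that a non-quadratic input penalty is meant to provide. The barrier term, the damping factor, and the critic update described in the abstract are not modeled here.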