Zero-sum two-player game theoretic formulation of affine nonlinear discrete-time systems using neural networks.

Publication information

IEEE Trans Cybern. 2013 Dec;43(6):1641-55. doi: 10.1109/TSMCB.2012.2227253.

Abstract

This paper considers the nearly optimal solution for discrete-time (DT) affine nonlinear control systems in the presence of partially unknown internal system dynamics and disturbances. The approach is based on a successive approximate solution of the Hamilton-Jacobi-Isaacs (HJI) equation, which appears in optimal control. A successive approximation approach for updating the control and disturbance inputs of DT nonlinear affine systems is proposed. Moreover, sufficient conditions for the convergence of the approximate HJI solution to the saddle point are derived, and an iterative approach to approximate the HJI equation using a neural network (NN) is presented. Then, the requirement of full knowledge of the internal dynamics of the nonlinear DT system is relaxed by using a second NN online approximator. The result is a closed-loop optimal NN controller obtained via offline learning. A numerical example illustrates the effectiveness of the approach.
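As a rough illustration of the saddle-point idea in the abstract, the sketch below works through the simplest special case: a scalar linear system x_{k+1} = a x_k + b u_k + d w_k with stage cost q x^2 + r u^2 - gamma^2 w^2 and a quadratic value function V(x) = p x^2. This is not the paper's algorithm (which handles nonlinear affine systems with NN approximators). It is a plain value iteration on the scalar game Riccati recursion, and every symbol and function name in it is an assumption made for this example.

```python
# Illustrative sketch only: scalar linear-quadratic zero-sum game,
# NOT the neural-network method of the paper. All names are hypothetical.

def game_value_iteration(a, b, d, q, r, gamma, iters=500, tol=1e-12):
    """Iterate p <- q + a^2 p / (1 + p (b^2/r - d^2/gamma^2)).

    The fixed point gives the game value V(x) = p x^2 for
    x_{k+1} = a x + b u + d w with stage cost q x^2 + r u^2 - gamma^2 w^2.
    """
    s = b * b / r - d * d / gamma ** 2
    p = 0.0
    for _ in range(iters):
        p_next = q + a * a * p / (1.0 + s * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


def saddle_gains(p, a, b, d, r, gamma):
    """Return (ku, kw) so that u = ku * x minimizes and w = kw * x maximizes."""
    s = b * b / r - d * d / gamma ** 2
    delta = 1.0 + s * p
    ku = -p * a * b / (r * delta)          # minimizing control gain
    kw = p * a * d / (gamma ** 2 * delta)  # maximizing disturbance gain
    return ku, kw


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the recursion.
    a, b, d, q, r, gamma = 0.9, 1.0, 0.5, 1.0, 1.0, 2.0
    p = game_value_iteration(a, b, d, q, r, gamma)
    ku, kw = saddle_gains(p, a, b, d, r, gamma)
    print("p =", p, "ku =", ku, "kw =", kw)
    print("closed-loop pole =", a + b * ku + d * kw)
```

In this scalar setting the saddle point exists only when gamma^2 - p d^2 > 0, which loosely mirrors the role of the sufficient conditions mentioned in the abstract. The nonlinear case replaces the scalar p with an approximated value function.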

Similar articles

Zero-sum two-player game theoretic formulation of affine nonlinear discrete-time systems using neural networks.

IEEE Trans Cybern. 2013 Dec;43(6):1641-55. doi: 10.1109/TSMCB.2012.2227253.

Online adaptive policy learning algorithm for H∞ state feedback control of unknown affine nonlinear discrete-time systems.

IEEE Trans Cybern. 2014 Dec;44(12):2706-18. doi: 10.1109/TCYB.2014.2313915. Epub 2014 Jul 28.

Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete-time systems.

IEEE Trans Neural Netw. 2008 Jan;19(1):90-106. doi: 10.1109/TNN.2007.900227.

Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence.

Neural Netw. 2009 Jul-Aug;22(5-6):851-60. doi: 10.1016/j.neunet.2009.06.014. Epub 2009 Jul 1.

Optimization of sampling intervals for tracking control of nonlinear systems: A game theoretic approach.

Neural Netw. 2019 Jun;114:78-90. doi: 10.1016/j.neunet.2019.02.008. Epub 2019 Mar 8.

Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear H∞ control.

IEEE Trans Neural Netw Learn Syst. 2012 Dec;23(12):1884-95. doi: 10.1109/TNNLS.2012.2217349.

Reinforcement-learning-based dual-control methodology for complex nonlinear discrete-time systems with application to spark engine EGR operation.

IEEE Trans Neural Netw. 2008 Aug;19(8):1369-88. doi: 10.1109/TNN.2008.2000452.

Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof.

IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):943-9. doi: 10.1109/TSMCB.2008.926614.

Neural network-based finite-horizon optimal control of uncertain affine nonlinear discrete-time systems.

IEEE Trans Neural Netw Learn Syst. 2015 Mar;26(3):486-99. doi: 10.1109/TNNLS.2014.2315646.

Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks.

IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):994-1001. doi: 10.1109/TSMCB.2008.926607.

Cited by

Robust Trajectory Tracking Control for Continuous-Time Nonlinear Systems with State Constraints and Uncertain Disturbances.

Entropy (Basel). 2022 Jun 11;24(6):816. doi: 10.3390/e24060816.
