Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data.

Publication information

IEEE Trans Neural Netw Learn Syst. 2017 Mar;28(3):714-725. doi: 10.1109/TNNLS.2016.2561300. Epub 2016 May 27.

DOI:10.1109/TNNLS.2016.2561300

Abstract

H∞ control is a powerful method for solving the disturbance attenuation problems that arise in some control systems. The design of such controllers relies on solving a zero-sum game (ZSG). In practical applications, however, the exact dynamics are mostly unknown, and identifying the dynamics introduces errors that are detrimental to control performance. To overcome this problem, this paper proposes an iterative adaptive dynamic programming algorithm that solves the continuous-time, unknown nonlinear ZSG using only online data. A model-free approach to the Hamilton-Jacobi-Isaacs equation is developed based on the policy iteration method. The control policy, the disturbance policy, and the value function are approximated by neural networks (NNs) under a critic-actor-disturber structure, and the NN weights are solved by the least-squares method. According to the theoretical analysis, the algorithm is equivalent to a Gauss-Newton method for solving an optimization problem, and it converges uniformly to the optimal solution. The online data can also be used repeatedly, which is highly efficient. Simulation results demonstrate that the algorithm can feasibly solve the unknown nonlinear ZSG. Compared with other algorithms, it saves a significant amount of online measurement time.
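The policy iteration idea mentioned in the abstract can be illustrated on a much simpler problem. The sketch below is not the paper's algorithm: it is model-based, uses no neural networks or online data, and assumes a hypothetical scalar linear system dx/dt = a*x + b*u + d*w with running cost q*x^2 + r*u^2 - gamma^2*w^2, where u is the control and w the disturbance. All names and parameter values are assumptions chosen for illustration. In this special case the value function is V(x) = p*x^2, and updating both policies simultaneously reduces to Newton's method on the scalar game Riccati equation.

```python
import math


def solve_scalar_zsg(a, b, d, q, r, gamma, iterations=20):
    """Policy iteration for a hypothetical scalar linear zero-sum game.

    Policies are u = -K*x (control) and w = L*x (disturbance).
    Returns the value coefficient p and the final gains K, L.
    """
    K, L = 0.0, 0.0  # initial policies; the closed loop must be stable
    p = 0.0
    for _ in range(iterations):
        a_cl = a - b * K + d * L  # closed-loop dynamics under current policies
        if a_cl >= 0.0:
            raise ValueError("closed loop is not stable under current policies")
        # Policy evaluation: solve 2*p*a_cl + q + r*K^2 - gamma^2*L^2 = 0 for p.
        p = (q + r * K**2 - gamma**2 * L**2) / (-2.0 * a_cl)
        # Policy improvement: update both players at the same time.
        K = b * p / r
        L = d * p / gamma**2
    return p, K, L


def riccati_root(a, b, d, q, r, gamma):
    """Positive root of 2*a*p + q - (b^2/r - d^2/gamma^2)*p^2 = 0."""
    c = b**2 / r - d**2 / gamma**2  # assumed positive (gamma large enough)
    return (a + math.sqrt(a**2 + c * q)) / c


# Illustrative parameter values (assumptions, not taken from the paper).
params = dict(a=-1.0, b=1.0, d=1.0, q=1.0, r=1.0, gamma=2.0)
p, K, L = solve_scalar_zsg(**params)
p_star = riccati_root(**params)
print(p, p_star)
```

The iterated value coefficient can be compared against the closed-form root as a sanity check. In the nonlinear, unknown-dynamics setting the abstract describes, the evaluation step would instead be a least-squares fit of NN weights to measured data.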

Similar articles

Online adaptive policy learning algorithm for H∞ state feedback control of unknown affine nonlinear discrete-time systems.

IEEE Trans Cybern. 2014 Dec;44(12):2706-18. doi: 10.1109/TCYB.2014.2313915. Epub 2014 Jul 28.

Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear H∞ control.

IEEE Trans Neural Netw Learn Syst. 2012 Dec;23(12):1884-95. doi: 10.1109/TNNLS.2012.2217349.

Observer-based event-triggered control for zero-sum games of input constrained multi-player nonlinear systems.

Neural Netw. 2021 Dec;144:101-112. doi: 10.1016/j.neunet.2021.08.012. Epub 2021 Aug 25.

Optimal H∞ tracking control of nonlinear systems with zero-equilibrium-free via novel adaptive critic designs.

Neural Netw. 2023 Jul;164:105-114. doi: 10.1016/j.neunet.2023.04.021. Epub 2023 Apr 20.

Model-Free Adaptive Control for Unknown Nonlinear Zero-Sum Differential Game.

IEEE Trans Cybern. 2018 May;48(5):1633-1646. doi: 10.1109/TCYB.2017.2712617. Epub 2017 Jul 17.

Model-Free Adaptive Optimal Control for Unknown Nonlinear Multiplayer Nonzero-Sum Game.

IEEE Trans Neural Netw Learn Syst. 2022 Feb;33(2):879-892. doi: 10.1109/TNNLS.2020.3030127. Epub 2022 Feb 3.

Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update.

IEEE Trans Neural Netw Learn Syst. 2012 Jul;23(7):1118-29. doi: 10.1109/TNNLS.2012.2196708.

Event-driven H∞ control with critic learning for nonlinear systems.

Neural Netw. 2020 Dec;132:30-42. doi: 10.1016/j.neunet.2020.08.004. Epub 2020 Aug 20.

A policy iteration approach to online optimal control of continuous-time constrained-input systems.

ISA Trans. 2013 Sep;52(5):611-21. doi: 10.1016/j.isatra.2013.04.004. Epub 2013 May 24.
