Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems.

Publication information

IEEE Trans Cybern. 2016 Mar;46(3):840-53. doi: 10.1109/TCYB.2015.2492242. Epub 2015 Nov 2.

DOI:10.1109/TCYB.2015.2492242

Abstract

In this paper, a value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon undiscounted optimal control problems for discrete-time nonlinear systems. The present value iteration ADP algorithm permits an arbitrary positive semi-definite function to initialize the algorithm. A novel convergence analysis is developed to guarantee that the iterative value function converges to the optimal performance index function. Initialized by different initial functions, it is proven that the iterative value function will be monotonically nonincreasing, monotonically nondecreasing, or nonmonotonic and will converge to the optimum. In this paper, for the first time, the admissibility properties of the iterative control laws are developed for value iteration algorithms. It is emphasized that new termination criteria are established to guarantee the effectiveness of the iterative control laws. Neural networks are used to approximate the iterative value function and compute the iterative control law, respectively, for facilitating the implementation of the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the present method.
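The monotonicity behavior described in the abstract can be illustrated with a minimal sketch. It is not the paper's neural-network implementation. It assumes a scalar linear system x(k+1) = a·x(k) + b·u(k) with utility q·x² + r·u², for which the iterative value function stays quadratic, V_i(x) = p_i·x², so the value iteration update reduces to a scalar recursion. The function name `value_iteration_lq` and all parameter values are hypothetical choices made for illustration.

```python
def value_iteration_lq(p0, a=0.9, b=1.0, q=1.0, r=1.0, tol=1e-10, max_iter=1000):
    """Undiscounted value iteration for an assumed scalar linear-quadratic problem.

    Implements V_{i+1}(x) = min_u [q*x^2 + r*u^2 + V_i(a*x + b*u)]
    with V_i(x) = p_i * x^2, starting from V_0(x) = p0 * x^2 (p0 >= 0).
    Returns the list of coefficients [p_0, p_1, ...].
    """
    history = [p0]
    p = p0
    for _ in range(max_iter):
        # The minimizing control is u = -(a*b*p) / (r + b^2*p) * x;
        # substituting it back gives the coefficient of the next value function.
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        history.append(p_next)
        if abs(p_next - p) < tol:
            break
        p = p_next
    return history


# Two different positive semi-definite initial functions (assumed values).
from_zero = value_iteration_lq(0.0)    # V_0 = 0: coefficients never decrease
from_large = value_iteration_lq(10.0)  # V_0 = 10*x^2: coefficients never increase

print("from V_0 = 0:     ", [round(p, 4) for p in from_zero[:4]], "->", from_zero[-1])
print("from V_0 = 10*x^2:", [round(p, 4) for p in from_large[:4]], "->", from_large[-1])
```

Under these assumptions both runs approach the same limit, one from below and one from above. The nonmonotonic case mentioned in the abstract cannot occur with a single scalar coefficient, so it is not shown here.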

Similar articles

Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis.

IEEE Trans Neural Netw Learn Syst. 2017 Nov;28(11):2490-2502. doi: 10.1109/TNNLS.2016.2593743.

Finite-approximation-error-based discrete-time iterative adaptive dynamic programming.

IEEE Trans Cybern. 2014 Dec;44(12):2820-33. doi: 10.1109/TCYB.2014.2354377. Epub 2014 Sep 26.

Finite-Approximation-Error-Based Optimal Control Approach for Discrete-Time Nonlinear Systems.

IEEE Trans Cybern. 2013 Apr;43(2):779-89. doi: 10.1109/TSMCB.2012.2216523. Epub 2013 Mar 7.

Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems.

IEEE Trans Neural Netw Learn Syst. 2014 Mar;25(3):621-34. doi: 10.1109/TNNLS.2013.2281663.

Infinite horizon self-learning optimal control of nonaffine discrete-time nonlinear systems.

IEEE Trans Neural Netw Learn Syst. 2015 Apr;26(4):866-79. doi: 10.1109/TNNLS.2015.2401334. Epub 2015 Mar 2.

Discrete-Time Optimal Control via Local Policy Iteration Adaptive Dynamic Programming.

IEEE Trans Cybern. 2017 Oct;47(10):3367-3379. doi: 10.1109/TCYB.2016.2586082. Epub 2016 Jul 18.

Continuous-Time Time-Varying Policy Iteration.

IEEE Trans Cybern. 2020 Dec;50(12):4958-4971. doi: 10.1109/TCYB.2019.2926631. Epub 2020 Dec 3.

An iterative ϵ-optimal control scheme for a class of discrete-time nonlinear systems with unfixed initial state.

Neural Netw. 2012 Aug;32:236-44. doi: 10.1016/j.neunet.2012.02.027. Epub 2012 Feb 24.

Discrete-Time Impulsive Adaptive Dynamic Programming.

IEEE Trans Cybern. 2020 Oct;50(10):4293-4306. doi: 10.1109/TCYB.2019.2906694. Epub 2019 Apr 11.

Cited by

Optimal Robust Control of Nonlinear Systems with Unknown Dynamics via NN Learning with Relaxed Excitation.

Entropy (Basel). 2024 Jan 14;26(1):72. doi: 10.3390/e26010072.

Robust Trajectory Tracking Control for Continuous-Time Nonlinear Systems with State Constraints and Uncertain Disturbances.

Entropy (Basel). 2022 Jun 11;24(6):816. doi: 10.3390/e24060816.

The Algorithms of Distributed Learning and Distributed Estimation about Intelligent Wireless Sensor Network.

Sensors (Basel). 2020 Feb 27;20(5):1302. doi: 10.3390/s20051302.

Adaptive Dynamic Programming-Based Multi-Sensor Scheduling for Collaborative Target Tracking in Energy Harvesting Wireless Sensor Networks.

Sensors (Basel). 2018 Nov 22;18(12):4090. doi: 10.3390/s18124090.
