Offline and Online Adaptive Critic Control Designs With Stability Guarantee Through Value Iteration.

Authors

Ha Mingming, Wang Ding, Liu Derong

Publication information

IEEE Trans Cybern. 2022 Dec;52(12):13262-13274. doi: 10.1109/TCYB.2021.3107801. Epub 2022 Nov 18.

DOI:10.1109/TCYB.2021.3107801
PMID:34516384
Abstract

This article studies the stability of the closed-loop system under the various control policies generated by value iteration (VI). Several stability properties are investigated, including admissibility criteria and the attraction domain. An offline integrated VI scheme with a stability guarantee is developed by combining the advantages of VI and policy iteration, which makes it convenient to obtain admissible control policies. In addition, based on the concept of the attraction domain, an online adaptive dynamic programming (ADP) algorithm that uses immature control policies is developed, and the state trajectory under this online algorithm is guaranteed to converge to the origin. For linear systems in particular, the online ADP algorithm with a general scheme has a stronger stability property: the theoretical results show that the stability of the linear system can be guaranteed even if the control policy sequence contains a finite number of unstable elements. Numerical results verify the effectiveness of the proposed algorithms.
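
The linear-system claim can be made concrete with a toy example. The sketch below is not the authors' algorithm; it is a minimal illustration under assumed values: a scalar system x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2, where the numbers a = 2, b = 1, q = 1, r = 1 and both function names are hypothetical. It runs standard value iteration from V_0 = 0, marks which intermediate ("immature") policies are stabilizing, and then applies the evolving policy sequence online, one policy per time step.

```python
def value_iteration_scalar(a, b, q, r, iterations):
    """Value iteration for x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2.

    The value function is V_i(x) = p_i * x^2, starting from p_0 = 0.
    Returns a list of (i, p_i, k_i, stabilizing) tuples, where u = -k_i * x is
    the greedy policy for V_i and `stabilizing` means |a - b*k_i| < 1.
    """
    p = 0.0
    history = []
    for i in range(iterations):
        # Greedy gain for the current value function V_i(x) = p * x^2.
        k = a * b * p / (r + b * b * p)
        stabilizing = abs(a - b * k) < 1.0
        history.append((i, p, k, stabilizing))
        # Value iteration update (scalar Riccati recursion).
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return history


def simulate_evolving_policy(a, b, gains, x0):
    """Apply the gain sequence online: at step i the control is u = -gains[i] * x."""
    x = x0
    trajectory = [x]
    for k in gains:
        x = (a - b * k) * x
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # Assumed toy parameters: an open-loop unstable plant (a = 2).
    history = value_iteration_scalar(a=2.0, b=1.0, q=1.0, r=1.0, iterations=30)
    for i, p, k, ok in history[:5]:
        print(f"iteration {i}: p = {p:.4f}, gain = {k:.4f}, stabilizing = {ok}")

    gains = [k for _, _, k, _ in history]
    trajectory = simulate_evolving_policy(2.0, 1.0, gains, x0=1.0)
    print("final state magnitude:", abs(trajectory[-1]))
```

In this toy setting the first two policies are not stabilizing (the closed-loop multiplier is 2 and then exactly 1), yet the state still decays to the origin once the later, stabilizing policies take over. This mirrors, in the simplest possible case, the abstract's statement about a policy sequence with finitely many unstable elements.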

Similar articles

1
Offline and Online Adaptive Critic Control Designs With Stability Guarantee Through Value Iteration.
IEEE Trans Cybern. 2022 Dec;52(12):13262-13274. doi: 10.1109/TCYB.2021.3107801. Epub 2022 Nov 18.
2
Stability and Admissibility Analysis for Zero-Sum Games Under General Value Iteration Formulation.
IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):8707-8718. doi: 10.1109/TNNLS.2022.3152268. Epub 2023 Oct 27.
3
Novel Discounted Adaptive Critic Control Designs With Accelerated Learning Formulation.
IEEE Trans Cybern. 2024 May;54(5):3003-3016. doi: 10.1109/TCYB.2022.3233593. Epub 2024 Apr 16.
4
Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis.
IEEE Trans Neural Netw Learn Syst. 2017 Nov;28(11):2490-2502. doi: 10.1109/TNNLS.2016.2593743.
5
Data-Based Optimal Switching and Control With Admissibility Guaranteed Q-Learning.
IEEE Trans Neural Netw Learn Syst. 2025 Apr;36(4):5963-5973. doi: 10.1109/TNNLS.2024.3405739. Epub 2025 Apr 4.
6
A Novel Value Iteration Scheme With Adjustable Convergence Rate.
IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7430-7442. doi: 10.1109/TNNLS.2022.3143527. Epub 2023 Oct 6.
7
Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms.
IEEE Trans Cybern. 2017 Oct;47(10):3331-3340. doi: 10.1109/TCYB.2016.2611613. Epub 2016 Oct 3.
8
Advanced Optimal Tracking Control With Stability Guarantee via Novel Value Learning Formulation.
IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):8254-8265. doi: 10.1109/TNNLS.2022.3226518. Epub 2024 Jun 3.
9
Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems.
IEEE Trans Cybern. 2016 Mar;46(3):840-53. doi: 10.1109/TCYB.2015.2492242. Epub 2015 Nov 2.
10
Improved value iteration for neural-network-based stochastic optimal control design.
Neural Netw. 2020 Apr;124:280-295. doi: 10.1016/j.neunet.2020.01.004. Epub 2020 Jan 28.