Evolving and Incremental Value Iteration Schemes for Nonlinear Discrete-Time Zero-Sum Games.

Publication information

IEEE Trans Cybern. 2023 Jul;53(7):4487-4499. doi: 10.1109/TCYB.2022.3198078. Epub 2023 Jun 15.

DOI: 10.1109/TCYB.2022.3198078
PMID: 36063514
Abstract

In this article, evolving and incremental value iteration (VI) frameworks are constructed to address the discrete-time zero-sum game problem. First, in the evolving scheme, the closed-loop system is regulated by an evolving policy pair. For the control stage, a stability criterion is established to guarantee that the evolving policy pairs are usable. Second, a novel incremental VI algorithm, which takes the historical information of the iterative process into account, is developed to solve the regulation and tracking problems for the nonlinear zero-sum game. By introducing different incremental factors, the convergence rate of the iterative cost function sequence can be adjusted. Finally, two simulation examples, covering linear and nonlinear systems, demonstrate the performance and validity of the proposed evolving and incremental VI schemes.
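The abstract does not state the incremental update rule, so the following is only a minimal sketch of how a factor could adjust the convergence rate of a VI cost sequence. It assumes a relaxation-style update, p_next = p + omega * (T(p) - p), applied to a scalar linear-quadratic zero-sum game x_next = a*x + b*u + d*w with stage cost q*x^2 + r*u^2 - gamma^2*w^2 and value V(x) = p*x^2. The update form, the function names, and all numeric parameters are illustrative assumptions, not the authors' algorithm.

```python
import math


def bellman_operator(p, a, b, d, q, r, gamma):
    """One standard VI sweep for V(x) = p * x**2 in the scalar LQ zero-sum game.

    Closed form of min over u, max over w of
    q*x^2 + r*u^2 - gamma^2*w^2 + p*(a*x + b*u + d*w)^2,
    valid while gamma^2 - d^2 * p > 0 (saddle-point condition).
    """
    s = b * b / r - d * d / (gamma * gamma)
    return q + a * a * p / (1.0 + s * p)


def incremental_vi(omega, a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0,
                   tol=1e-10, max_iter=10_000):
    """Relaxed VI (assumed form): p_next = p + omega * (T(p) - p).

    omega = 1 recovers standard VI. Returns (p, number_of_sweeps).
    """
    p = 0.0  # start from the zero cost function
    for sweep in range(1, max_iter + 1):
        p_next = p + omega * (bellman_operator(p, a, b, d, q, r, gamma) - p)
        if abs(p_next - p) < tol:
            return p_next, sweep
        p = p_next
    raise RuntimeError("VI did not converge within max_iter sweeps")


def fixed_point(a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0):
    """Positive root of s*p^2 + (1 - q*s - a^2)*p - q = 0, i.e. p = T(p)."""
    s = b * b / r - d * d / (gamma * gamma)
    c = 1.0 - q * s - a * a
    return (-c + math.sqrt(c * c + 4.0 * s * q)) / (2.0 * s)


if __name__ == "__main__":
    for omega in (0.5, 1.0, 1.1):
        p, sweeps = incremental_vi(omega)
        print(f"omega={omega}: p={p:.8f} after {sweeps} sweeps")
```

In this toy setting, every tested factor reaches the same fixed point, and the number of sweeps depends on the factor: a factor below 1 takes more sweeps than standard VI, while a factor slightly above 1 takes fewer. Whether this carries over to the nonlinear setting in the paper depends on the stability and convergence conditions it establishes.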

Similar articles

1
Evolving and Incremental Value Iteration Schemes for Nonlinear Discrete-Time Zero-Sum Games.
IEEE Trans Cybern. 2023 Jul;53(7):4487-4499. doi: 10.1109/TCYB.2022.3198078. Epub 2023 Jun 15.
2
Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate.
Neural Netw. 2024 Jul;175:106274. doi: 10.1016/j.neunet.2024.106274. Epub 2024 Mar 27.
3
Dichotomy value iteration with parallel learning design towards discrete-time zero-sum games.
Neural Netw. 2023 Oct;167:751-762. doi: 10.1016/j.neunet.2023.09.009. Epub 2023 Sep 7.
4
Stability and Admissibility Analysis for Zero-Sum Games Under General Value Iteration Formulation.
IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):8707-8718. doi: 10.1109/TNNLS.2022.3152268. Epub 2023 Oct 27.
5
A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm.
IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):937-42. doi: 10.1109/TSMCB.2008.920269.
6
Optimal H tracking control of nonlinear systems with zero-equilibrium-free via novel adaptive critic designs.
Neural Netw. 2023 Jul;164:105-114. doi: 10.1016/j.neunet.2023.04.021. Epub 2023 Apr 20.
7
Advanced optimal tracking integrating a neural critic technique for asymmetric constrained zero-sum games.
Neural Netw. 2024 Sep;177:106388. doi: 10.1016/j.neunet.2024.106388. Epub 2024 May 15.
8
Observer-based event-triggered control for zero-sum games of input constrained multi-player nonlinear systems.
Neural Netw. 2021 Dec;144:101-112. doi: 10.1016/j.neunet.2021.08.012. Epub 2021 Aug 25.
9
Optimal tracking control for a class of nonlinear discrete-time systems with time delays based on heuristic dynamic programming.
IEEE Trans Neural Netw. 2011 Dec;22(12):1851-62. doi: 10.1109/TNN.2011.2172628. Epub 2011 Nov 1.
10
Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms.
IEEE Trans Cybern. 2017 Oct;47(10):3331-3340. doi: 10.1109/TCYB.2016.2611613. Epub 2016 Oct 3.