Wang Ding, Zhao Mingming, Ha Mingming, Qiao Junfei

IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):8707-8718. doi: 10.1109/TNNLS.2022.3152268. Epub 2023 Oct 27.

This article investigates the general value iteration (GVI) algorithm for discrete-time zero-sum games. The theoretical analysis focuses on the stability of the controlled system and the admissibility of the iterative policy pairs. A new criterion is established to determine whether the current policy pair is admissible. Based on this criterion, an improved GVI algorithm for zero-sum games is developed, which guarantees that all iterative policy pairs are admissible if the current policy pair satisfies the criterion. Using the attraction domain, we show that the state trajectory stays in that domain under either the fixed or the evolving policy pair, provided the initial state belongs to the domain. In particular, the evolving policy pair can stabilize the controlled system. These theoretical results are applied to linear and nonlinear systems via offline and online critic control design.
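For the linear case mentioned in the abstract, value iteration over quadratic value functions V_i(x) = x' P_i x reduces to a game Riccati recursion. The following is a minimal sketch under assumptions that are not stated in the abstract: dynamics x_{k+1} = A x_k + B u_k + D w_k, utility x'Qx + u'Ru - gamma^2 w'w, and illustrative values for all matrices and for gamma. The nonzero initial matrix P0 = I stands in for a general initial value function. The spectral-radius test at the end is a standard stability check for linear systems and is not the admissibility criterion derived in the article.

```python
import numpy as np

def policy_gains(P, A, B, D, R, gamma):
    """Saddle-point gains for V(x) = x' P x, so that u = Ku x and w = Kw x."""
    m, q = B.shape[1], D.shape[1]
    # Second-order terms in the stacked decision vector [u; w].
    M = np.block([
        [R + B.T @ P @ B, B.T @ P @ D],
        [D.T @ P @ B, D.T @ P @ D - gamma**2 * np.eye(q)],
    ])
    N = np.vstack([B.T @ P @ A, D.T @ P @ A])
    K = -np.linalg.solve(M, N)
    return K[:m], K[m:]

def riccati_step(P, A, B, D, Q, R, gamma):
    """One value update: utility of the policy pair plus V_i at the next state."""
    Ku, Kw = policy_gains(P, A, B, D, R, gamma)
    Ac = A + B @ Ku + D @ Kw  # closed-loop matrix under the current pair
    P_next = Q + Ku.T @ R @ Ku - gamma**2 * Kw.T @ Kw + Ac.T @ P @ Ac
    return 0.5 * (P_next + P_next.T)  # remove rounding asymmetry

def value_iteration(P0, A, B, D, Q, R, gamma, tol=1e-10, max_iter=1000):
    """Iterate from an arbitrary positive semidefinite P0 until P stops changing."""
    P = P0.copy()
    for _ in range(max_iter):
        P_next = riccati_step(P, A, B, D, Q, R, gamma)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    raise RuntimeError("value iteration did not converge")

# Illustrative (assumed) system and weights.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.1], [0.0]])
Q = np.eye(2)
R = np.eye(1)
gamma = 5.0

P = value_iteration(np.eye(2), A, B, D, Q, R, gamma)
Ku, Kw = policy_gains(P, A, B, D, R, gamma)
# Spectral radius of the closed loop; a value below 1 means the pair is stabilizing.
rho = np.max(np.abs(np.linalg.eigvals(A + B @ Ku + D @ Kw)))
print("P =", P)
print("closed-loop spectral radius =", rho)
```

Evaluating `rho` with the gains from an intermediate `P` instead of the converged one gives a rough way to see whether an iterate's policy pair already stabilizes this particular linear system.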

Stability and Admissibility Analysis for Zero-Sum Games Under General Value Iteration Formulation.
