Offline and Online Adaptive Critic Control Designs With Stability Guarantee Through Value Iteration

Authors

Ha Mingming, Wang Ding, Liu Derong

Publication information

IEEE Trans Cybern. 2022 Dec;52(12):13262-13274. doi: 10.1109/TCYB.2021.3107801. Epub 2022 Nov 18.

DOI:10.1109/TCYB.2021.3107801

Abstract

This article is concerned with the stability of the closed-loop system under the various control policies generated by value iteration (VI). Several stability properties, including admissibility criteria and the attraction domain, are investigated. An offline integrated VI scheme with a stability guarantee is developed by combining the advantages of VI and policy iteration, which makes it convenient to obtain admissible control policies. In addition, based on the concept of the attraction domain, an online adaptive dynamic programming (ADP) algorithm that uses immature control policies is developed. Notably, the state trajectory under the online algorithm is guaranteed to converge to the origin. For linear systems in particular, the online ADP algorithm with a general scheme possesses a stronger stability property: the theoretical results show that the stability of the linear system can be guaranteed even if the control policy sequence includes finitely many unstable elements. Numerical results verify the effectiveness of the proposed algorithms.
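
To illustrate why the stability of intermediate VI policies is a concern, the following is a minimal sketch of textbook value iteration for a discrete-time linear-quadratic problem. It is not the integrated VI scheme or the online ADP algorithm of the article. The matrices A, B, Q, R, the zero initial cost matrix, and the function name value_iteration_lqr are all assumptions chosen for illustration. The sketch records the closed-loop spectral radius of each greedy policy u = -K_i x, so that early, non-stabilizing ("immature") policies can be told apart from later, stabilizing ones.

```python
import numpy as np


def value_iteration_lqr(A, B, Q, R, num_iters=200):
    """Value iteration for x_{k+1} = A x_k + B u_k with stage cost x'Qx + u'Ru.

    Starts from P_0 = 0 and returns the final cost matrix P together with the
    closed-loop spectral radius of every intermediate policy u = -K_i x.
    """
    n = A.shape[0]
    P = np.zeros((n, n))  # assumed initial cost function V_0(x) = 0
    radii = []
    for _ in range(num_iters):
        # Greedy gain with respect to the current cost matrix P_i.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Spectral radius < 1 means the policy u = -K x stabilizes the system.
        radii.append(max(abs(np.linalg.eigvals(A - B @ K))))
        # Value update: P_{i+1} = Q + A'P_i A - A'P_i B K_i.
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return P, radii


# Hypothetical open-loop unstable system (one eigenvalue of A is 1.2).
A = np.array([[1.2, 0.5], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

P, radii = value_iteration_lqr(A, B, Q, R)
# Index of the first iteration whose greedy policy is stabilizing.
first_admissible = next(i for i, r in enumerate(radii) if r < 1.0)
print("first stabilizing policy index:", first_admissible)
print("final closed-loop spectral radius:", radii[-1])
```

In this sketch the first greedy policies leave the closed loop unstable, while the converged policy is stabilizing. Checking the spectral radius is only one simple test that applies to linear systems; the article's own admissibility criteria are not reproduced here.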

