Solving the Zero-Sum Control Problem for Tidal Turbine System: An Online Reinforcement Learning Approach.

Author information

Fang Haiyang, Zhang Maoguang, He Shuping, Luan Xiaoli, Liu Fei, Ding Zhengtao

Publication information

IEEE Trans Cybern. 2023 Dec;53(12):7635-7647. doi: 10.1109/TCYB.2022.3186886. Epub 2023 Nov 29.

Abstract

A novel completely mode-free integral reinforcement learning (CMFIRL)-based iteration algorithm is proposed in this article to solve the two-player zero-sum game and the associated Nash equilibrium problem, that is, to find the optimal control policy pairs, for a tidal turbine system described by a continuous-time Markov jump linear model with exact transition probabilities and completely unknown dynamics. First, the tidal turbine system is modeled as a Markov jump linear system, and a subsystem transformation technique is designed to decouple the jumping modes. Then, a completely mode-free reinforcement learning algorithm is employed to solve the game-coupled algebraic Riccati equations without using any information about the system dynamics, in order to reach the Nash equilibrium. The learning algorithm uses a single iteration loop that updates the control policy and the disturbance policy simultaneously. In addition, an exploration signal is added to excite the system, and the convergence of the CMFIRL iteration algorithm is rigorously proved. Finally, a simulation example is given to illustrate the effectiveness and applicability of the control design approach.
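
As a rough illustration of the "one iteration loop" idea, the sketch below runs a simultaneous policy update for a zero-sum linear-quadratic game. It is not the CMFIRL algorithm from the article: it assumes a single mode (no Markov jumps), assumes the system matrices are known (model-based rather than learned from data), and assumes the cost integrand x'Qx + u'Ru - γ²w'w with an open-loop stable A so that zero initial gains are admissible. All function names, matrices, and parameter values are hypothetical.

```python
import numpy as np


def solve_lyapunov(a_cl, m):
    """Solve a_cl.T @ P + P @ a_cl + m = 0 for P via Kronecker vectorization."""
    n = a_cl.shape[0]
    eye = np.eye(n)
    # Row-major vec: vec(A' P) = kron(A', I) vec(P), vec(P A) = kron(I, A') vec(P)
    lhs = np.kron(a_cl.T, eye) + np.kron(eye, a_cl.T)
    p = np.linalg.solve(lhs, -m.reshape(-1)).reshape(n, n)
    return 0.5 * (p + p.T)  # symmetrize against round-off


def zero_sum_policy_iteration(a, b, d, q, r, gamma, iters=50, tol=1e-10):
    """Update control gain K (u = -K x) and disturbance gain L (w = L x) together.

    Assumption: `a` is Hurwitz, so K = 0, L = 0 is an admissible starting pair.
    """
    n = a.shape[0]
    k = np.zeros((b.shape[1], n))
    l = np.zeros((d.shape[1], n))
    p = np.zeros((n, n))
    for _ in range(iters):
        # Policy evaluation: Lyapunov equation for the current gain pair
        a_cl = a - b @ k + d @ l
        m = q + k.T @ r @ k - gamma**2 * (l.T @ l)
        p_next = solve_lyapunov(a_cl, m)
        # Policy improvement: both players are updated in the same loop
        k = np.linalg.solve(r, b.T @ p_next)
        l = d.T @ p_next / gamma**2
        converged = np.linalg.norm(p_next - p) < tol
        p = p_next
        if converged:
            break
    return p, k, l


def gare_residual(p, a, b, d, q, r, gamma):
    """Residual of the game algebraic Riccati equation at P."""
    return (a.T @ p + p @ a + q
            - p @ b @ np.linalg.solve(r, b.T) @ p
            + p @ d @ d.T @ p / gamma**2)


# Hypothetical two-state example (values chosen only for illustration)
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
D = np.array([[1.0], [0.0]])
Q = np.eye(2)
R = np.array([[1.0]])
P, K, L = zero_sum_policy_iteration(A, B, D, Q, R, gamma=5.0)
print("GARE residual norm:", np.linalg.norm(gare_residual(P, A, B, D, Q, R, 5.0)))
```

Each pass evaluates the current gain pair through one Lyapunov equation and then improves both gains from the same P, which amounts to a Newton step on the game Riccati equation. The approach described in the abstract differs in that it replaces the model-based evaluation step with measured data and handles the coupling between jumping modes.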
