
Adaptive Observation-Based Efficient Reinforcement Learning for Uncertain Systems.

Publication

IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5492-5503. doi: 10.1109/TNNLS.2021.3070852. Epub 2022 Oct 5.

Abstract

This article develops an adaptive observation-based efficient reinforcement learning (RL) approach for systems with uncertain drift dynamics. A novel concurrent learning adaptive extended observer (CL-AEO) is first designed to jointly estimate the system state and parameters. This observer has a two-time-scale structure and does not require any additional numerical techniques to compute state-derivative information. The idea of concurrent learning (CL) is leveraged to use recorded data, which leads to a relaxed, verifiable excitation condition for the convergence of parameter estimation. Based on the estimated state and parameters provided by the CL-AEO, a simulation-of-experience-based RL scheme is developed to approximate the optimal control policy online. Rigorous theoretical analysis shows that practical convergence of the system state to the origin, and of the developed policy to the ideal optimal policy, can be achieved without the persistence of excitation (PE) condition. Finally, the effectiveness and superiority of the developed methodology are demonstrated via comparative simulations.
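The core idea behind concurrent learning, as summarized in the abstract, is that a memory of recorded regressor data can drive parameter convergence even after the online signal stops being exciting. The following is a minimal illustrative sketch of that mechanism only, for a scalar plant x_dot = theta*x + u with unknown theta; the function name, gains, and memory-recording rule are assumptions for illustration, and the paper's full CL-AEO (two-time-scale extended observer, joint state estimation) and the RL policy approximation are not reproduced here.

```python
# Minimal concurrent-learning (CL) sketch: estimate unknown theta in
# x_dot = theta * x + u from recorded data, without persistent excitation.
# All gains and the recording rule are illustrative assumptions.

def cl_estimate(theta_true=2.0, dt=1e-3, T=5.0, gamma=5.0, k_cl=1.0):
    x, theta_hat = 1.0, 0.0
    memory = []  # recorded (phi, y) pairs, where y = x_dot - u = theta * phi
    for _ in range(int(T / dt)):
        u = -3.0 * x                  # stabilizing input (assumed known)
        x_dot = theta_true * x + u    # simulated true plant
        phi, y = x, x_dot - u         # regressor and its measured response
        # Record informative data early; the state decays to 0 afterwards,
        # so the instantaneous regressor alone would not be exciting.
        if len(memory) < 20 and abs(phi) > 1e-3:
            memory.append((phi, y))
        # CL update law: instantaneous prediction error + sum over memory.
        e_inst = phi * (y - theta_hat * phi)
        e_mem = sum(p * (yy - theta_hat * p) for p, yy in memory)
        theta_hat += dt * gamma * (e_inst + k_cl * e_mem)
        x += dt * x_dot               # Euler integration of the plant
    return theta_hat
```

Because the closed-loop state decays to the origin, the instantaneous term vanishes; it is the summed prediction error over the recorded pairs that keeps the estimate converging to theta_true, which is the relaxed excitation condition the abstract refers to.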
