Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse.

Authors

Queeney James, Paschalidis Ioannis Ch, Cassandras Christos G

Affiliations

Mitsubishi Electric Research Laboratories, Cambridge, MA 02139 USA. He performed the majority of this work while with the Division of Systems Engineering, Boston University, Boston, MA 02215 USA.

Department of Electrical and Computer Engineering and Division of Systems Engineering, Boston University, Boston, MA 02215 USA.

Publication information

IEEE Trans Automat Contr. 2025 Feb;70(2):1236-1243. doi: 10.1109/tac.2024.3454011. Epub 2024 Sep 3.

Abstract

We develop a new class of model-free deep reinforcement learning algorithms for data-driven, learning-based control. Our Generalized Policy Improvement algorithms combine the policy improvement guarantees of on-policy methods with the efficiency of sample reuse, addressing a trade-off between two important deployment requirements for real-world control: (i) practical performance guarantees and (ii) data efficiency. We demonstrate the benefits of this new class of algorithms through extensive experimental analysis on a broad range of simulated control tasks.
