Model-based reinforcement learning with dimension reduction.

Affiliations

Department of Computer Science, The University of Tokyo, Japan.

Department of Brain Robot Interface, ATR Computational Neuroscience Laboratory, Japan.

Publication information

Neural Netw. 2016 Dec;84:1-16. doi: 10.1016/j.neunet.2016.08.005. Epub 2016 Aug 24.

Abstract

The goal of reinforcement learning is to learn an optimal policy that controls an agent so as to maximize the cumulative reward. The model-based reinforcement learning approach learns a transition model of the environment from data and then derives the optimal policy from the learned model. However, learning an accurate transition model in a high-dimensional environment requires a large amount of data, which is difficult to obtain. To overcome this difficulty, we propose combining model-based reinforcement learning with the recently developed least-squares conditional entropy (LSCE) method, which performs transition model estimation and dimension reduction simultaneously. We further extend the proposed method to imitation learning scenarios. Experimental results show that policy search combined with LSCE performs well on high-dimensional control tasks, including control of a real humanoid robot.
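
Below is a minimal, illustrative sketch of the model-based loop the abstract describes: reduce the state dimension, fit a transition model in the reduced space, and plan with the learned model. It is not the paper's LSCE implementation: PCA stands in for LSCE's conditional-entropy-based dimension reduction, ridge regression stands in for its conditional density estimation, and random-shooting search stands in for the policy-search component. All function names and the `reward_fn` callback are illustrative assumptions.

```python
# Sketch of model-based RL with dimension reduction (stand-ins, not LSCE).
import numpy as np

def fit_projection(states, k):
    """PCA stand-in for LSCE's dimension reduction: top-k principal axes."""
    centered = states - states.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]  # (k, state_dim) projection onto a k-dim subspace

def fit_transition_model(z, actions, z_next, lam=1e-3):
    """Ridge-regression stand-in for the transition model: z' ~ [z; a] @ W."""
    x = np.hstack([z, actions])               # (n, k + action_dim)
    gram = x.T @ x + lam * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ z_next)  # (k + action_dim, k)

def plan_action(w, proj, state, reward_fn, action_dim,
                n_candidates=256, horizon=10, rng=None):
    """Random-shooting policy search: roll candidate action sequences
    through the learned low-dimensional model, return the best first action."""
    rng = np.random.default_rng() if rng is None else rng
    best_a, best_ret = None, -np.inf
    for _ in range(n_candidates):
        a_seq = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))
        z, ret = proj @ state, 0.0
        for a in a_seq:
            z = np.concatenate([z, a]) @ w    # predicted next reduced state
            ret += reward_fn(z, a)
        if ret > best_ret:
            best_ret, best_a = ret, a_seq[0]
    return best_a

# Hypothetical usage on logged transitions (S, A, S_next):
#   proj = fit_projection(S, k=3)
#   w = fit_transition_model(S @ proj.T, A, S_next @ proj.T)
#   a0 = plan_action(w, proj, s0, lambda z, a: -np.sum(z**2), A.shape[1])
```

Note the key difference from this two-stage sketch: per the abstract, LSCE estimates the low-dimensional subspace and the transition model simultaneously, which is what makes the approach data-efficient in high-dimensional environments.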
