Learning Tetris using the noisy cross-entropy method.

Author information

Szita István, Lörincz András

Publication information

Neural Comput. 2006 Dec;18(12):2936-41. doi: 10.1162/neco.2006.18.12.2936.

DOI: 10.1162/neco.2006.18.12.2936

PMID: 17052153

Abstract

The cross-entropy method is an efficient and general optimization algorithm. However, its applicability in reinforcement learning (RL) seems to be limited because it often converges to suboptimal policies. We apply noise for preventing early convergence of the cross-entropy method, using Tetris, a computer game, for demonstration. The resulting policy outperforms previous RL algorithms by almost two orders of magnitude.
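The idea in the abstract can be illustrated with a minimal sketch of a Gaussian cross-entropy search in which a noise term is added to the variance after every update, so the search distribution cannot collapse too early. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function name `noisy_cross_entropy`, the population size, the elite fraction, the constant noise value, and the toy quadratic objective `toy_score`, which stands in for a Tetris game score.

```python
import random


def noisy_cross_entropy(score, dim, iterations=50, population=100,
                        elite_frac=0.1, noise=0.05, seed=0):
    """Maximize `score` over real vectors with a Gaussian cross-entropy search.

    `noise` is a constant added to every variance after each update.
    The constant schedule and all default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    var = [100.0] * dim  # broad initial search distribution (assumed)
    n_elite = max(1, int(population * elite_frac))

    for _ in range(iterations):
        # Sample candidate parameter vectors from independent Gaussians.
        samples = [[rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
                   for _ in range(population)]
        # Keep the best-scoring fraction of the population.
        elite = sorted(samples, key=score, reverse=True)[:n_elite]
        # Refit the mean to the elite samples.
        mean = [sum(x[i] for x in elite) / n_elite for i in range(dim)]
        # Refit the variance, then add the noise term that keeps it from
        # shrinking to zero before a good region has been found.
        var = [sum((x[i] - mean[i]) ** 2 for x in elite) / n_elite + noise
               for i in range(dim)]
    return mean


def toy_score(w):
    """Hypothetical objective: higher is better, maximum at `target`."""
    target = [3.0, -2.0, 0.5]
    return -sum((a - b) ** 2 for a, b in zip(w, target))


best = noisy_cross_entropy(toy_score, dim=3)
print(best)
```

Setting `noise=0` gives the plain cross-entropy update. In a reinforcement learning setting, `score` would instead run one or more episodes with the sampled policy parameters and return the resulting total reward.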

Similar articles

Reinforcement learning of motor skills with policy gradients.

Neural Netw. 2008 May;21(4):682-97. doi: 10.1016/j.neunet.2008.02.003. Epub 2008 Apr 26.

Ensemble algorithms in reinforcement learning.

IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):930-6. doi: 10.1109/TSMCB.2008.920231.

A parameter control method in reinforcement learning to rapidly follow unexpected environmental changes.

Biosystems. 2004 Nov;77(1-3):109-17. doi: 10.1016/j.biosystems.2004.05.001.

Reinforcement learning in continuous time and space: interference and not ill conditioning is the main problem when using distributed function approximators.

IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):950-6. doi: 10.1109/TSMCB.2008.921000.

Optimal instruments and models for noisy chaos.

Chaos. 2007 Dec;17(4):043127. doi: 10.1063/1.2818152.

Fast computation of approximate entropy.

Comput Methods Programs Biomed. 2008 Jul;91(1):48-54. doi: 10.1016/j.cmpb.2008.02.008.

Improved Adaptive-Reinforcement Learning Control for morphing unmanned air vehicles.

IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):1014-20. doi: 10.1109/TSMCB.2008.922018.

Robust reinforcement learning.

Neural Comput. 2005 Feb;17(2):335-59. doi: 10.1162/0899766053011528.

ECG compression using uniform scalar dead-zone quantization and conditional entropy coding.

Med Eng Phys. 2008 May;30(4):523-30. doi: 10.1016/j.medengphy.2007.06.008. Epub 2007 Aug 10.
