Fu Yuchen, Liu Quan, Ling Xionghong, Cui Zhiming
Suzhou Industrial Park Institute of Services Outsourcing, Suzhou, Jiangsu 215123, China; School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China.
School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China.
ScientificWorldJournal. 2014 Jan 28;2014:120760. doi: 10.1155/2014/120760. eCollection 2014.
Reinforcement learning (RL) is a kind of interactive learning method whose main characteristics are "trial and error" and "related reward." A hierarchical reinforcement learning method based on action subrewards is proposed to address two problems: the "curse of dimensionality," in which the state space grows exponentially with the number of features, and slow convergence. The method greatly reduces the state space and selects actions purposefully and efficiently, which optimizes the reward function and speeds up convergence. Applied to online learning in the game Tetris, the method, which combines a hierarchical reinforcement learning algorithm with action subrewards, markedly improves convergence speed in the experiments. The hierarchical approach also alleviates the "curse of dimensionality" to a certain extent. Performance under different parameter settings is compared and analyzed as well.
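The exponential growth mentioned in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' method: the board size (10 by 20), the binary cell encoding, the column-height abstraction, and the function names `raw_state_count` and `abstract_state_count` are all assumptions made for illustration.

```python
# Illustrative arithmetic only. Board size and the feature abstraction are
# assumptions, not details taken from the abstract.
BOARD_WIDTH = 10   # assumed Tetris board width
BOARD_HEIGHT = 20  # assumed Tetris board height


def raw_state_count(width: int, height: int) -> int:
    """Upper bound on board configurations when each cell is empty or filled."""
    return 2 ** (width * height)


def abstract_state_count(num_features: int, values_per_feature: int) -> int:
    """State count when the board is summarized by a few discrete features.

    The count is exponential in the number of features: each added feature
    multiplies it by values_per_feature.
    """
    return values_per_feature ** num_features


raw = raw_state_count(BOARD_WIDTH, BOARD_HEIGHT)
# Hypothetical abstraction: one height value (0..BOARD_HEIGHT) per column.
by_height = abstract_state_count(BOARD_WIDTH, BOARD_HEIGHT + 1)

print(f"cell-level states have {len(str(raw))} decimal digits")
print(f"column-height states have {len(str(by_height))} decimal digits")
```

Even a coarse abstraction shrinks the space by many orders of magnitude, yet the abstract count still grows exponentially in the number of features. This is why the reduction is described as solving the problem only to a certain extent.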