Miyamoto Hiroyuki, Morimoto Jun, Doya Kenji, Kawato Mitsuo
Kawato Dynamic Brain Project, Japan Science and Technology Corporation, Kyoto, Japan.
Neural Netw. 2004 Apr;17(3):299-305. doi: 10.1016/j.neunet.2003.11.004.
In this paper, we propose a new learning framework for motor control. The framework combines two components: reinforcement learning and via-point representation. In the field of motor control, conventional reinforcement learning has been used to acquire control sequences for tasks such as cart-pole and stand-up robot control. Recently, researchers have become interested in hierarchical architectures with multiple levels and multiple temporal and spatial scales. Our framework has a two-level hierarchical architecture. The higher level is implemented using via-point representation, which corresponds to macro-actions or multiple time scales. The lower level is implemented using a trajectory generator that produces primitive actions. Because via-points are temporally localized, the framework can modify an ongoing movement by changing via-points and regenerating the trajectory. Successful results were obtained in a computer simulation of the cart-pole swing-up task.
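The two-level split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which trajectory generator is used, so the sketch assumes a simple minimum-jerk interpolation that comes to rest at each via-point, and the names `ViaPoint`, `min_jerk_segment`, and `generate_trajectory` are hypothetical. The via-points are fixed by hand here, standing in for the output of the higher, reinforcement-learning level.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ViaPoint:
    t: float  # time at which the trajectory passes through the point (s)
    x: float  # target position at that time


def min_jerk_segment(x0: float, x1: float, s: float) -> float:
    """Minimum-jerk blend from x0 to x1 for normalized time s in [0, 1].

    Assumption: zero velocity and acceleration at both ends of the segment.
    """
    s = min(max(s, 0.0), 1.0)
    return x0 + (x1 - x0) * (10 * s**3 - 15 * s**4 + 6 * s**5)


def generate_trajectory(via_points: List[ViaPoint], dt: float) -> List[float]:
    """Lower level: expand sparse via-points into a dense position sequence.

    Assumes at least two via-points with distinct times.
    """
    pts = sorted(via_points, key=lambda p: p.t)
    n_steps = int(round((pts[-1].t - pts[0].t) / dt))
    traj = []
    seg = 0  # index of the segment [pts[seg], pts[seg + 1]] containing t
    for k in range(n_steps + 1):
        t = pts[0].t + k * dt
        while seg < len(pts) - 2 and t > pts[seg + 1].t:
            seg += 1
        a, b = pts[seg], pts[seg + 1]
        s = (t - a.t) / (b.t - a.t)
        traj.append(min_jerk_segment(a.x, b.x, s))
    return traj


# Higher level (stand-in): a hand-picked plan of three via-points.
plan = [ViaPoint(0.0, 0.0), ViaPoint(1.0, 1.0), ViaPoint(2.0, 0.5)]
original = generate_trajectory(plan, dt=0.5)

# Moving only the last via-point changes only the final part of the movement.
revised_plan = plan[:2] + [ViaPoint(2.0, 0.0)]
revised = generate_trajectory(revised_plan, dt=0.5)
```

In this sketch the samples up to the second via-point are identical in `original` and `revised`, which is one way to read the abstract's remark about temporally localized via-points. With a generator that does not stop at each via-point, the effect of a change would likely spread into neighboring segments.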