Robotics Research Group, Faculty 3 - Mathematics and Computer Science, Universität Bremen, Bremen, Germany.
Front Neurorobot. 2013 Jul 26;7:11. doi: 10.3389/fnbot.2013.00011. eCollection 2013.
Life-long learning of reusable, versatile skills is a key prerequisite for embodied agents that act in a complex, dynamic environment and are faced with different tasks over their lifetime. We address the question of how an agent can learn useful skills efficiently during a developmental period, i.e., when no task is imposed on it and no external reward signal is provided. Learning of skills in a developmental period needs to be incremental and self-motivated. We propose a new incremental, task-independent skill discovery approach that is suited for continuous domains. Furthermore, the agent learns specific skills based on intrinsic motivation mechanisms that determine which skills learning focuses on at a given point in time. We evaluate the approach in a reinforcement learning setup in two continuous domains with complex dynamics. We show that an intrinsically motivated, skill-learning agent outperforms an agent that learns task solutions from scratch. Furthermore, we compare different intrinsic motivation mechanisms and how efficiently they make use of the agent's developmental period.