Department of Neurobiology, Weizmann Institute of Science, Rehovot, 76100, Israel.
Neural Netw. 2012 Aug;32:119-29. doi: 10.1016/j.neunet.2012.02.024. Epub 2012 Feb 14.
A curious agent acts so as to optimize its learning about itself and its environment, without external supervision. We present a model of hierarchical curiosity loops for such an autonomous active learning agent, whereby each loop selects the optimal action that maximizes the agent's learning of sensory-motor correlations. The model is based on rewarding the learner's prediction errors in an actor-critic reinforcement learning (RL) paradigm. Hierarchy is achieved by utilizing a previously learned motor-sensory mapping, which enables the learning of other mappings, thus increasing the extent and diversity of knowledge and skills. We demonstrate the relevance of this architecture to active sensing using the well-studied vibrissae (whiskers) system, where rodents acquire sensory information through repeated whisker movements. We show that hierarchical curiosity loops, starting with optimal learning of the internal models of whisker motion and then extending to object localization, result in free-air whisking and object palpation, respectively.
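The core mechanism described above, using the learner's prediction error as the reward signal inside an actor-critic loop, can be illustrated with a minimal sketch. The discrete motor-command space, the hidden motor-to-sensory mapping, the noise level, the learning rates, and the tabular single-state actor-critic below are illustrative assumptions for a single curiosity loop, not the paper's implementation.

```python
# Minimal sketch of one curiosity loop, assuming a discrete motor command space
# and a noisy scalar sensory outcome. All quantities here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

n_actions = 5                                          # discrete motor commands
true_map = np.sin(np.linspace(0, np.pi, n_actions))    # hidden motor->sensory mapping
forward_model = np.zeros(n_actions)                    # learner: predicted outcome per action
preferences = np.zeros(n_actions)                      # actor: softmax action preferences
value = 0.0                                            # critic: expected intrinsic reward

alpha_model, alpha_actor, alpha_critic = 0.2, 0.5, 0.1

for step in range(2000):
    # Actor: sample a motor command from a softmax over preferences.
    policy = np.exp(preferences - preferences.max())
    policy /= policy.sum()
    a = rng.choice(n_actions, p=policy)

    # Environment: noisy sensory consequence of the motor command.
    sensation = true_map[a] + 0.05 * rng.standard_normal()

    # Learner: intrinsic reward is the forward model's prediction error.
    error = sensation - forward_model[a]
    reward = abs(error)
    forward_model[a] += alpha_model * error

    # Critic and actor: TD-style updates driven by the intrinsic reward.
    td_error = reward - value
    value += alpha_critic * td_error
    preferences[a] += alpha_actor * td_error

print("learned mapping:", np.round(forward_model, 3))
print("final policy   :", np.round(policy, 3))
```

In this sketch, actions whose sensory consequences are still poorly predicted yield higher intrinsic reward, so the policy concentrates sampling on them until the forward model converges and the reward fades. The hierarchy described in the abstract stacks such loops, with each learned mapping enabling the learning of the next.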