Botvinick Matthew M, Niv Yael, Barto Andrew G
Princeton Neuroscience Institute, Department of Psychology, Princeton University, Green Hall, Princeton, NJ 08540, United States.
Cognition. 2009 Dec;113(3):262-280. doi: 10.1016/j.cognition.2008.08.011. Epub 2008 Oct 15.
Research on human and animal behavior has long emphasized its hierarchical structure: the divisibility of ongoing behavior into discrete tasks, which are composed of subtask sequences, which in turn are built of simple actions. The hierarchical structure of behavior has also been of enduring interest within neuroscience, where it has been widely considered to reflect prefrontal cortical functions. In this paper, we reexamine behavioral hierarchy and its neural substrates from the point of view of recent developments in computational reinforcement learning. Specifically, we consider a set of approaches known collectively as hierarchical reinforcement learning, which extend the reinforcement learning paradigm by allowing the learning agent to aggregate actions into reusable subroutines or skills. A close look at the components of hierarchical reinforcement learning suggests how they might map onto neural structures, in particular regions within the dorsolateral and orbital prefrontal cortex. It also suggests specific ways in which hierarchical reinforcement learning might provide a complement to existing psychological models of hierarchically structured behavior. A particularly important question that hierarchical reinforcement learning brings to the fore is that of how learning identifies new action routines that are likely to provide useful building blocks in solving a wide range of future problems. Here and at many other points, hierarchical reinforcement learning offers an appealing framework for investigating the computational and neural underpinnings of hierarchically structured behavior.
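The core idea the abstract describes, namely extending reinforcement learning so that an agent can aggregate primitive actions into reusable subroutines, is commonly formalized as the "options" framework. The following is a minimal illustrative sketch of that framework, not the paper's own implementation; the toy environment, the option name, and all function signatures are assumptions introduced here for illustration:

```python
from collections import defaultdict

class Option:
    """A temporally extended 'subroutine': a primitive-action policy
    plus a termination condition. Illustrative sketch only; names and
    signatures are assumptions, not the paper's implementation."""
    def __init__(self, name, policy, terminates):
        self.name = name
        self.policy = policy          # state -> primitive action
        self.terminates = terminates  # state -> bool

def run_option(env_step, state, option, gamma=0.9):
    """Execute an option until its termination condition fires.
    Returns the discounted reward accumulated while it ran, the
    resulting state, and the number of primitive steps elapsed."""
    total, discount, k = 0.0, 1.0, 0
    while True:
        action = option.policy(state)
        state, reward = env_step(state, action)
        total += discount * reward
        discount *= gamma
        k += 1
        if option.terminates(state):
            return total, state, k

def smdp_q_update(Q, s, o, r, s_next, k, options, alpha=0.1, gamma=0.9):
    """SMDP Q-learning update over options:
    Q(s,o) <- Q(s,o) + alpha * [r + gamma**k * max_o' Q(s',o') - Q(s,o)],
    where k is the number of primitive steps the option took."""
    best_next = max(Q[(s_next, o2.name)] for o2 in options)
    Q[(s, o.name)] += alpha * (r + gamma ** k * best_next - Q[(s, o.name)])

# Hypothetical toy environment: a chain of states 0..4, with a reward
# of 1 delivered on reaching state 4.
def env_step(state, action):
    nxt = max(0, min(4, state + action))
    return nxt, (1.0 if nxt == 4 else 0.0)

# One reusable subroutine: keep moving right until state 4 is reached.
to_goal = Option("to-goal", policy=lambda s: +1, terminates=lambda s: s == 4)

Q = defaultdict(float)
r, s_next, k = run_option(env_step, 0, to_goal)
smdp_q_update(Q, 0, to_goal, r, s_next, k, [to_goal])
```

Treating the whole subroutine as a single choice point (the SMDP update above) is what lets learned skills be reused as building blocks across tasks, which is the property the abstract highlights as central to hierarchical reinforcement learning.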