Rasmussen Daniel, Voelker Aaron, Eliasmith Chris
Applied Brain Research, Inc., Waterloo, ON, Canada.
Centre for Theoretical Neuroscience, University of Waterloo, Waterloo, ON, Canada.
PLoS One. 2017 Jul 6;12(7):e0180234. doi: 10.1371/journal.pone.0180234. eCollection 2017.
We develop a novel, biologically detailed neural model of reinforcement learning (RL) processes in the brain. This model incorporates a broad range of biological features that pose challenges to neural RL, such as temporally extended action sequences, continuous environments involving unknown time delays, and noisy/imprecise computations. Most significantly, we expand the model into the realm of hierarchical reinforcement learning (HRL), which divides the RL process into a hierarchy of actions at different levels of abstraction. Here we implement all the major components of HRL in a neural model that captures a variety of known anatomical and physiological properties of the brain. We demonstrate the performance of the model in a range of different environments, emphasizing our aim of understanding the brain's general reinforcement learning ability. These results show that the model compares well to previous modelling work and demonstrates improved performance as a result of its hierarchical ability. We also show that the model's behaviour is consistent with available data on human hierarchical RL, and we generate several novel predictions.