DeepMind, London, UK.
Department of Cell and Developmental Biology, University College London, London, UK.
Nature. 2018 May;557(7705):429-433. doi: 10.1038/s41586-018-0102-6. Epub 2018 May 9.
Deep neural networks have achieved impressive successes in fields ranging from object recognition to complex games such as Go. Navigation, however, remains a substantial challenge for artificial agents, with deep neural networks trained by reinforcement learning failing to rival the proficiency of mammalian spatial behaviour, which is underpinned by grid cells in the entorhinal cortex. Grid cells are thought to provide a multi-scale periodic representation that functions as a metric for coding space and is critical for integrating self-motion (path integration) and planning direct trajectories to goals (vector-based navigation). Here we set out to leverage the computational functions of grid cells to develop a deep reinforcement learning agent with mammal-like navigational abilities. We first trained a recurrent network to perform path integration, leading to the emergence of representations resembling grid cells, as well as other entorhinal cell types. We then showed that this representation provided an effective basis for an agent to locate goals in challenging, unfamiliar, and changeable environments, optimizing the primary objective of navigation through deep reinforcement learning. The performance of agents endowed with grid-like representations surpassed that of an expert human and comparison agents, with the metric quantities necessary for vector-based navigation derived from grid-like units within the network. Furthermore, grid-like representations enabled agents to conduct shortcut behaviours reminiscent of those performed by mammals. Our findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. As such, our results support neuroscientific theories that see grid cells as critical for vector-based navigation, demonstrating that the latter can be combined with path-based strategies to support navigation in challenging environments.
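The abstract's first step, training a recurrent network to path-integrate, can be illustrated with a minimal sketch: a recurrent network receives a stream of self-motion (velocity) inputs and is trained to predict the agent's current location, encoded as activations of simulated place cells. This is not the authors' code; the network sizes, the three-component velocity input, the linear bottleneck (where grid-like units are reported to emerge), and the Gaussian place-cell targets are illustrative assumptions about a setup of this kind.

```python
# Hedged sketch of a path-integration network (not the published implementation).
import torch
import torch.nn as nn

class PathIntegrator(nn.Module):
    def __init__(self, n_place_cells=256, hidden=128, bottleneck=64):
        super().__init__()
        # Input: (speed, sin of heading change, cos of heading change) per step.
        self.rnn = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
        self.bottleneck = nn.Linear(hidden, bottleneck)  # latent code to inspect for grid-like units
        self.dropout = nn.Dropout(0.5)
        self.place_head = nn.Linear(bottleneck, n_place_cells)

    def forward(self, velocities):
        # velocities: (batch, time, 3) -> logits over place cells and the latent code.
        h, _ = self.rnn(velocities)
        g = self.dropout(self.bottleneck(h))
        return self.place_head(g), g

def place_cell_targets(positions, centres, sigma=0.1):
    # Soft assignment of each 2-D position to Gaussian place-cell centres.
    d2 = ((positions[:, :, None, :] - centres[None, None]) ** 2).sum(-1)
    return torch.softmax(-d2 / (2 * sigma ** 2), dim=-1)

# Toy training step on random stand-in data; real inputs would be trajectories
# of a simulated forager moving through an enclosure.
torch.manual_seed(0)
model = PathIntegrator()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
centres = torch.rand(256, 2)            # place-cell centres in a unit square
vel = torch.randn(8, 100, 3)            # stand-in velocity stream
pos = torch.rand(8, 100, 2)             # stand-in ground-truth positions
logits, _ = model(vel)
log_p = torch.log_softmax(logits, dim=-1)
loss = -(place_cell_targets(pos, centres) * log_p).sum(-1).mean()
loss.backward()
opt.step()
```

In the paper's framing, the latent code of such a network (here the bottleneck output `g`) is what would be inspected for grid-like spatial tuning and then supplied to a deep reinforcement learning agent as the basis for vector-based navigation.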