Goal-directed navigation in humans and deep reinforcement learning agents relies on an adaptive mix of vector-based and transition-based strategies.

Author information

Lan Denis C L, Hunt Laurence T, Summerfield Christopher

Affiliation

Department of Experimental Psychology, Medical Sciences Division, University of Oxford, Oxford, United Kingdom.

Publication information

PLoS Biol. 2025 Jul 29;23(7):e3003296. doi: 10.1371/journal.pbio.3003296. eCollection 2025 Jul.

DOI:10.1371/journal.pbio.3003296

PMID:40729396

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12324678/

Abstract

Much has been learned about the cognitive and neural mechanisms by which humans and other animals navigate to reach their goals. However, most studies have involved a single, well-learned environment. By contrast, real-world wayfinding often occurs in unfamiliar settings, requiring people to combine memories of landmark locations with on-the-fly information about transitions between adjacent states. Here, we studied the strategies that support human navigation in wholly novel environments. We found that during goal-directed navigation, people use a mix of strategies, adaptively deploying both associations between proximal states (state transitions) and directions between distal landmarks (vectors) at stereotyped points on a journey. Deep neural networks meta-trained with reinforcement learning to find the shortest path to goal exhibited near-identical strategies, and in doing so, developed units specialized for the implementation of vector- and state transition-based strategies. These units exhibited response patterns and representational geometries that resemble those previously found in mammalian navigational systems. Overall, our results suggest that effective navigation in novel environments relies on an adaptive mix of state transition- and vector-based strategies, supported by different modes of representing the environment in the brain.
