Chalmers Eric, Luczak Artur, Gruber Aaron J
Department of Neuroscience, University of Lethbridge, Lethbridge, AB, Canada.
Front Comput Neurosci. 2016 Dec 12;10:128. doi: 10.3389/fncom.2016.00128. eCollection 2016.
The mammalian brain is thought to use a version of Model-based Reinforcement Learning (MBRL) to guide "goal-directed" behavior, wherein animals consider goals and make plans to acquire desired outcomes. However, conventional MBRL algorithms do not fully explain animals' ability to rapidly adapt to environmental changes, or learn multiple complex tasks. They also require extensive computation, suggesting that goal-directed behavior is cognitively expensive. We propose here that key features of processing in the hippocampus support a flexible MBRL mechanism for spatial navigation that is computationally efficient and can adapt quickly to change. We investigate this idea by implementing a computational MBRL framework that incorporates features inspired by computational properties of the hippocampus: a hierarchical representation of space, "forward sweeps" through future spatial trajectories, and context-driven remapping of place cells. We find that a hierarchical abstraction of space greatly reduces the computational load (mental effort) required for adaptation to changing environmental conditions, and allows efficient scaling to large problems. It also allows abstract knowledge gained at high levels to guide adaptation to new obstacles. Moreover, a context-driven remapping mechanism allows learning and memory of multiple tasks. Simulating dorsal or ventral hippocampal lesions in our computational framework qualitatively reproduces behavioral deficits observed in rodents with analogous lesions. The framework may thus embody key features of how the brain organizes model-based RL to efficiently solve navigation and other difficult tasks.
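As a rough illustration of why a hierarchical abstraction of space can reduce planning cost, the sketch below compares value iteration over a full grid with a two-level plan: a coarse plan over blocks of cells, then a fine plan only inside the agent's current block. This is not the authors' implementation. The 16x16 grid, the 4x4 blocks, the reward scheme, the exit-cell rule, and every function name are assumptions made for this example, and the sketch omits forward sweeps and context-driven remapping entirely.

```python
def grid(x0, y0, width, height):
    """Return the cells of a rectangular grid and a 4-connected neighbor function."""
    states = [(x, y) for x in range(x0, x0 + width) for y in range(y0, y0 + height)]
    cells = set(states)

    def neighbors(s):
        x, y = s
        candidates = ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
        return [n for n in candidates if n in cells]

    return states, neighbors


def value_iteration(states, neighbors, goal, gamma=0.9, tol=1e-6):
    """Plan on a known deterministic model; return (values, number of backups)."""
    values = {s: 0.0 for s in states}
    backups = 0
    while True:
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # the goal is terminal and keeps value 0
            # Assumed reward: 1 for stepping into the goal, 0 otherwise.
            best = max((1.0 if n == goal else 0.0) + gamma * values[n]
                       for n in neighbors(s))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
            backups += 1
        if delta < tol:
            return values, backups


N, B = 16, 4  # assumed grid side length and block side length

# Flat planning: every cell of the full grid is backed up on every sweep.
states, nbrs = grid(0, 0, N, N)
_, flat_backups = value_iteration(states, nbrs, goal=(N - 1, N - 1))

# Coarse level: each state stands for one B x B block of cells.
blocks, block_nbrs = grid(0, 0, N // B, N // B)
coarse_values, coarse_backups = value_iteration(
    blocks, block_nbrs, goal=(N // B - 1, N // B - 1))

# Pick the neighboring block with the highest coarse value.
start_block = (0, 0)
next_block = max(block_nbrs(start_block), key=coarse_values.get)

# Fine level: plan only inside the current block, toward a cell on the
# edge facing the chosen block (a simplifying assumption).
dx = next_block[0] - start_block[0]
dy = next_block[1] - start_block[1]
ox, oy = start_block[0] * B, start_block[1] * B


def edge(offset, d):
    return offset + (B - 1 if d > 0 else 0 if d < 0 else B // 2)


exit_cell = (edge(ox, dx), edge(oy, dy))
local_states, local_nbrs = grid(ox, oy, B, B)
_, local_backups = value_iteration(local_states, local_nbrs, goal=exit_cell)

hier_backups = coarse_backups + local_backups
print("flat backups:", flat_backups)
print("hierarchical backups:", hier_backups)
```

The two-level plan touches far fewer states per decision, at the price of a route that is not guaranteed to be optimal. The same locality suggests why a new obstacle inside one block would only require replanning that block rather than the whole map.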