Keck Janis, Barry Caswell, Doeller Christian F, Jost Jürgen
Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.
Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.
PLoS Comput Biol. 2025 Jun 23;21(6):e1013056. doi: 10.1371/journal.pcbi.1013056. eCollection 2025 Jun.
In spatial cognition, the Successor Representation (SR) from reinforcement learning provides a compelling candidate for how predictive representations are used to encode space. In particular, hippocampal place cells are hypothesized to encode the SR. Here, we investigate how varying the temporal symmetry of learning rules influences those representations. To this end, we use a simple local learning rule that can be made insensitive to temporal order. We find analytically that a symmetric learning rule yields a successor representation under a symmetrized version of the experienced transition structure. We then apply this rule to a two-layer neural network model loosely resembling the hippocampal subfields CA3 (with a symmetric learning rule and recurrent weights) and CA1 (with an asymmetric learning rule and no recurrent weights). When the model is repeatedly exposed to a linear track, CA3 neurons show less shift of their place-field centre of mass than CA1 neurons, in line with existing empirical findings. To investigate the functional benefits of such symmetry, we employ a simple reinforcement learning agent that can learn either symmetric or classical successor representations. We find that the symmetric learning rule yields representations that generalize better when the agent must navigate to a new target without relearning the SR. This effect reverses when the state space is no longer symmetric. Thus, our results point to a potential benefit of the inductive bias afforded by symmetric learning rules in brain areas involved in spatial navigation, where the state space is naturally symmetric.
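A minimal numerical sketch of the quantities contrasted in the abstract, under illustrative assumptions (a deterministic left-to-right walk on a short linear track, discount factor 0.9, and one possible choice of symmetrization and re-normalization; this is not necessarily the paper's exact formulation): the classical SR is M = (I - gamma*T)^(-1) under the experienced transition matrix T, the symmetric variant applies the same formula to a symmetrized T, and a new target is handled without relearning by re-weighting M with a fresh reward vector.

import numpy as np

n_states = 8          # states along a linear track (assumed)
gamma = 0.9           # discount factor (assumed)

# Deterministic left-to-right walk on the track, absorbing at the far end.
T = np.zeros((n_states, n_states))
for s in range(n_states - 1):
    T[s, s + 1] = 1.0
T[-1, -1] = 1.0

def successor_representation(T, gamma):
    # Closed-form SR: expected discounted future state occupancies.
    return np.linalg.inv(np.eye(len(T)) - gamma * T)

# Symmetrize the experienced transitions, then re-normalize the rows so the
# result is again a stochastic matrix (one possible choice of symmetrization).
T_sym = T + T.T
T_sym = T_sym / T_sym.sum(axis=1, keepdims=True)

M_asym = successor_representation(T, gamma)      # "CA1-like" classical SR
M_sym = successor_representation(T_sym, gamma)   # "CA3-like" symmetric SR

# The classical SR only assigns mass to upcoming states on the track, whereas
# the symmetric SR spreads mass both forward and backward from each state.
print(np.round(M_asym[3], 2))
print(np.round(M_sym[3], 2))

# Re-targeting without relearning the SR: a new goal is expressed as a reward
# vector r, and state values follow immediately as V = M @ r (goal location
# below is hypothetical). Only the symmetric SR assigns value to states on
# both sides of the goal.
r_new = np.zeros(n_states)
r_new[5] = 1.0
print(np.round(M_asym @ r_new, 2))
print(np.round(M_sym @ r_new, 2))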