Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, United Kingdom.
Research Department of Cell and Developmental Biology, University College London, London, United Kingdom.
eLife. 2023 Mar 16;12:e80663. doi: 10.7554/eLife.80663.
The predictive map hypothesis is a promising candidate principle for hippocampal function. A favoured formalisation of this hypothesis, called the successor representation, proposes that each place cell encodes the expected state occupancy of its target location in the near future. This predictive framework is supported by behavioural as well as electrophysiological evidence and has desirable consequences for both the generalisability and efficiency of reinforcement learning algorithms. However, it is unclear how the successor representation might be learnt in the brain. Error-driven temporal difference learning, commonly used to learn successor representations in artificial agents, is not known to be implemented in hippocampal networks. Instead, we demonstrate that spike-timing-dependent plasticity (STDP), a form of Hebbian learning, acting on temporally compressed trajectories known as 'theta sweeps', is sufficient to rapidly learn a close approximation to the successor representation. The model is biologically plausible: it uses spiking neurons modulated by theta-band oscillations, diffuse and overlapping place cell-like state representations, and experimentally matched parameters. We show how this model maps onto known aspects of hippocampal circuitry and explains substantial variance in the temporal difference successor matrix, consequently giving rise to place cells that demonstrate experimentally observed successor representation-related phenomena, including backwards expansion on a 1D track and elongation near walls in 2D. Finally, our model provides insight into the observed topographical ordering of place field sizes along the dorsal-ventral axis by showing that this ordering is necessary to prevent the detrimental mixing of larger place fields, which encode longer timescale successor representations, with more fine-grained predictions of spatial location.
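For readers unfamiliar with the successor representation, the following is a minimal sketch of the temporal difference baseline that the abstract refers to. It is not the STDP and theta-sweep model described in the paper. It assumes a toy setting that does not come from the paper: ten discrete states on a circular track, an agent that steps rightward with probability 0.8 and otherwise stays put, a discount factor of 0.8, and a fixed learning rate. All function and variable names are hypothetical. The sketch learns the successor matrix with tabular TD(0) and compares it with the closed-form matrix M = (I - γT)^-1.

```python
import numpy as np


def closed_form_sr(T, gamma):
    """Successor matrix M = sum_k gamma^k T^k = (I - gamma * T)^-1."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)


def td_sr(states, n_states, gamma, alpha):
    """Learn the successor matrix from a state sequence with tabular TD(0)."""
    M = np.eye(n_states)
    for s, s_next in zip(states[:-1], states[1:]):
        onehot = np.zeros(n_states)
        onehot[s] = 1.0
        # TD error: observed occupancy plus discounted prediction from the
        # next state, minus the current prediction for state s.
        td_error = onehot + gamma * M[s_next] - M[s]
        M[s] += alpha * td_error
    return M


# Toy assumptions (illustrative only, not the paper's parameters).
n_states, gamma, alpha, p_right = 10, 0.8, 0.02, 0.8

# Transition matrix for a rightward-biased walk on a circular track.
T = np.zeros((n_states, n_states))
for s in range(n_states):
    T[s, (s + 1) % n_states] = p_right
    T[s, s] = 1.0 - p_right

# Sample a trajectory from the same policy.
rng = np.random.default_rng(0)
moves = rng.random(20000) < p_right
states = np.cumsum(moves) % n_states

M_true = closed_form_sr(T, gamma)
M_td = td_sr(states, n_states, gamma, alpha)

# Fraction of variance in the closed-form matrix captured by the TD estimate.
residual = ((M_td - M_true) ** 2).sum()
total = ((M_true - M_true.mean()) ** 2).sum()
r2 = 1.0 - residual / total
print(f"R^2 between TD and closed-form successor matrices: {r2:.3f}")
```

In this toy setting, column j of the matrix plays the role of a place field for state j. Because the agent mostly moves rightward, states just before j predict j more strongly than states just after it, which is the discrete analogue of the backwards expansion on a 1D track mentioned in the abstract.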