Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159, Mannheim, Germany.
Sainsbury Wellcome Centre for Neural Circuits and Behaviour, London, W1T 4JG, UK.
Nat Commun. 2020 Jul 10;11(1):3460. doi: 10.1038/s41467-020-17257-7.
The learning of stimulus-outcome associations allows for predictions about the environment. Ventral striatum and dopaminergic midbrain neurons form a larger network for generating reward prediction signals from sensory cues. Yet, the network plasticity mechanisms to generate predictive signals in these distributed circuits have not been entirely clarified. Also, direct evidence of the underlying interregional assembly formation and information transfer is still missing. Here we show that phasic dopamine is sufficient to reinforce the distinctness of stimulus representations in the ventral striatum even in the absence of reward. Upon such reinforcement, striatal stimulus encoding gives rise to interregional assemblies that drive dopaminergic neurons during stimulus-outcome learning. These assemblies dynamically encode the predicted reward value of conditioned stimuli. Together, our data reveal that ventral striatal and midbrain reward networks form a reinforcing loop to generate reward prediction coding.