Department of Psychology, University of Sydney, Camperdown, New South Wales 2006, Australia
Department of Psychology, University of California, Los Angeles, California 90095
J Neurosci. 2024 Aug 28;44(35):e0120242024. doi: 10.1523/JNEUROSCI.0120-24.2024.
Dopamine release in the nucleus accumbens core (NAcC) is generally considered a proxy for phasic firing of ventral tegmental area (VTA) dopamine neurons. Thus, dopamine release in NAcC is hypothesized to reflect a unitary role in reward prediction error signaling. However, recent studies reveal more diverse roles for dopamine neurons, supporting an emerging idea that dopamine regulates learning differently in distinct circuits. To understand whether the NAcC might regulate a unique component of learning, we recorded dopamine release in NAcC while male rats performed a backward conditioning task, in which a reward is followed by a neutral cue. We used this task because it allows us to delineate different components of learning, namely a sensory-specific inhibitory component and a general excitatory component. Furthermore, we have previously shown that VTA neurons are necessary for both the specific and general components of backward associations. Here, we found that dopamine release in NAcC to the reward increased across learning, while release to the cue that followed decreased as the cue became more expected. This mirrors the dopamine prediction error signal seen during forward conditioning and cannot be accounted for by temporal-difference reinforcement learning. Subsequent tests allowed us to dissociate these learning components and revealed that dopamine release in NAcC reflects the general excitatory component of backward associations, but not their sensory-specific component. These results emphasize the importance of examining the distinct functions of different dopamine projections in reinforcement learning.