Kahnt Thorsten, Schoenbaum Geoffrey
Intramural Research Program, National Institute on Drug Abuse, Baltimore, MD, USA.
Nat Rev Neurosci. 2025 Mar;26(3):169-178. doi: 10.1038/s41583-024-00898-8. Epub 2025 Jan 8.
Transient changes in the firing of midbrain dopamine neurons have been closely tied to the unidimensional value-based prediction error contained in temporal difference reinforcement learning models. However, whereas an abundance of work has now shown how well dopamine responses conform to the predictions of this hypothesis, far fewer studies have challenged its implicit assumption that dopamine is not involved in learning value-neutral features of reward. Here, we review studies in rats and humans that put this assumption to the test and that suggest dopamine transients provide a much richer signal, one that incorporates information beyond integrated value.
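For readers unfamiliar with the term, the unidimensional value-based prediction error mentioned in the abstract can be illustrated with a minimal TD(0) sketch. This is not code from the review: the function names, the two-state cue/reward episode, and the parameter values (`alpha`, `gamma`) are illustrative assumptions, and the update shown is the standard textbook form delta = r + gamma * V(s') - V(s).

```python
def td_error(reward, value_next, value_current, gamma=0.95):
    """Scalar TD prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_current


def td0_update(values, state, next_state, reward, alpha=0.1, gamma=0.95):
    """Apply one TD(0) update to values[state] in place; return the error."""
    delta = td_error(reward, values[next_state], values[state], gamma)
    values[state] += alpha * delta
    return delta


# Hypothetical episode (an assumption for illustration): a cue state is
# followed by a reward-delivery state, which is followed by a terminal state.
values = {"cue": 0.0, "reward": 0.0, "end": 0.0}
for _ in range(200):
    td0_update(values, "cue", "reward", reward=0.0)
    td0_update(values, "reward", "end", reward=1.0)

# Once a reward is fully predicted, the error is zero no matter which
# value-neutral feature (for example, flavor) the reward happens to have,
# because only the scalar value enters the computation.
fully_predicted_error = td_error(reward=1.0, value_next=0.0, value_current=1.0)
```

Because `td_error` returns a single number, two rewards of equal value but different sensory identity are indistinguishable to this learner. That is the implicit assumption the reviewed studies put to the test.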