Dopamine release plateau and outcome signals in dorsal striatum contrast with classic reinforcement learning formulations.

Author affiliations

McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 43 Vassar St., Cambridge, MA, 02139, USA.

Advanced Imaging Research Center, University of Texas, Southwestern Medical Center, Dallas, TX, 75390, USA.

Publication information

Nat Commun. 2024 Oct 14;15(1):8856. doi: 10.1038/s41467-024-53176-7.

Abstract

We recorded dopamine release signals in centromedial and centrolateral sectors of the striatum as mice learned consecutive versions of visual cue-outcome conditioning tasks. Dopamine release responses differed for the centromedial and centrolateral sites. In neither sector could these be accounted for by classic reinforcement learning alone as classically applied to the activity of nigral dopamine-containing neurons. Medially, cue responses ranged from initial sharp peaks to modulated plateau responses; outcome (reward) responses during cue conditioning were minimal or, initially, negative. At centrolateral sites, by contrast, strong, transient dopamine release responses occurred at both cue and outcome. Prolonged, plateau release responses to cues emerged in both regions when discriminative behavioral responses became required. At most sites, we found no evidence for a transition from outcome signaling to cue signaling, a hallmark of temporal difference reinforcement learning as applied to midbrain dopaminergic neuronal activity. These findings delineate a reshaping of striatal dopamine release activity during learning and suggest that current views of reward prediction error encoding need review to accommodate distinct learning-related spatial and temporal patterns of striatal dopamine release in the dorsal striatum.
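For readers unfamiliar with the "transition from outcome signaling to cue signaling" that the abstract calls a hallmark of temporal difference (TD) learning, the following is a minimal textbook-style sketch of tabular TD(0) on a single cue-reward trial. It is not the authors' model or analysis. The function name `simulate_td`, the state layout, and all parameter values are illustrative assumptions. The pre-cue state's value is pinned at zero on the assumption that cue onset cannot be predicted.

```python
def simulate_td(n_trials=500, n_steps=5, alpha=0.2, gamma=1.0, reward=1.0):
    """Tabular TD(0) on a cue -> delay -> reward trial (illustrative only).

    State 0 is the pre-cue baseline (value held at 0, assuming cue onset
    is unpredictable). States 1..n_steps follow cue onset. Reward is
    delivered on the transition out of state n_steps into a terminal
    state whose value is 0.
    Returns a list of (cue_rpe, outcome_rpe) tuples, one per trial.
    """
    values = [0.0] * (n_steps + 2)  # last index is the terminal state
    history = []
    for _ in range(n_trials):
        deltas = []
        for s in range(n_steps + 1):
            r = reward if s == n_steps else 0.0
            # TD error: reward plus discounted next value minus current value
            delta = r + gamma * values[s + 1] - values[s]
            if s > 0:  # the pre-cue baseline is never updated
                values[s] += alpha * delta
            deltas.append(delta)
        # deltas[0] is the error at cue onset, deltas[-1] at the outcome
        history.append((deltas[0], deltas[-1]))
    return history


if __name__ == "__main__":
    history = simulate_td()
    for trial in (0, 5, 20, 100, len(history) - 1):
        cue_rpe, outcome_rpe = history[trial]
        print(f"trial {trial:3d}: cue RPE = {cue_rpe:.3f}, outcome RPE = {outcome_rpe:.3f}")
```

In this idealized model the prediction error starts entirely at the outcome and, over trials, moves to the cue. That shift is the pattern the authors report finding no evidence for at most recording sites.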

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9862/11473536/3825e71eaca3/41467_2024_53176_Fig1_HTML.jpg
