Dopamine release plateau and outcome signals in dorsal striatum contrast with classic reinforcement learning formulations.

Affiliations

McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 43 Vassar St., Cambridge, MA, 02139, USA.

Advanced Imaging Research Center, University of Texas, Southwestern Medical Center, Dallas, TX, 75390, USA.

Publication information

Nat Commun. 2024 Oct 14;15(1):8856. doi: 10.1038/s41467-024-53176-7.

Abstract

We recorded dopamine release signals in centromedial and centrolateral sectors of the striatum as mice learned consecutive versions of visual cue-outcome conditioning tasks. Dopamine release responses differed for the centromedial and centrolateral sites. In neither sector could these be accounted for by classic reinforcement learning alone as classically applied to the activity of nigral dopamine-containing neurons. Medially, cue responses ranged from initial sharp peaks to modulated plateau responses; outcome (reward) responses during cue conditioning were minimal or, initially, negative. At centrolateral sites, by contrast, strong, transient dopamine release responses occurred at both cue and outcome. Prolonged, plateau release responses to cues emerged in both regions when discriminative behavioral responses became required. At most sites, we found no evidence for a transition from outcome signaling to cue signaling, a hallmark of temporal difference reinforcement learning as applied to midbrain dopaminergic neuronal activity. These findings delineate a reshaping of striatal dopamine release activity during learning and suggest that current views of reward prediction error encoding need review to accommodate distinct learning-related spatial and temporal patterns of striatal dopamine release in the dorsal striatum.
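The hallmark of temporal difference learning mentioned in the abstract, a shift of the prediction error from outcome to cue over training, can be illustrated with a minimal textbook-style sketch. This is not the authors' model or analysis. The function name, trial length, cue and reward time steps, learning rate, and discount factor below are all illustrative assumptions.

```python
def simulate_td_trials(n_trials=300, n_steps=10, cue_step=2, reward_step=7,
                       alpha=0.2, gamma=1.0):
    """Return per-trial lists of TD errors, one entry per within-trial time step.

    All parameter values are hypothetical and chosen only for illustration.
    """
    # values[t] is the learned value of being at time step t of a trial.
    # Steps before the cue are never updated, modelling an unpredictable cue.
    values = [0.0] * n_steps
    history = []
    for _ in range(n_trials):
        rpe = [0.0] * n_steps
        for t in range(1, n_steps):
            reward = 1.0 if t == reward_step else 0.0
            # TD error on arriving at step t from step t - 1.
            delta = reward + gamma * values[t] - values[t - 1]
            rpe[t] = delta
            # Only states from cue onset onward carry a learned prediction.
            if t - 1 >= cue_step:
                values[t - 1] += alpha * delta
        history.append(rpe)
    return history


if __name__ == "__main__":
    history = simulate_td_trials()
    for label, trial in (("first", history[0]), ("last", history[-1])):
        print(f"{label} trial: cue RPE = {trial[2]:.2f}, "
              f"outcome RPE = {trial[7]:.2f}")
```

In this sketch the error is concentrated at the outcome on the first trial and at the cue after training. That is the transition the study reports not finding at most striatal recording sites.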

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9862/11473536/3825e71eaca3/41467_2024_53176_Fig1_HTML.jpg
