Nucleus accumbens dopamine release reflects Bayesian inference during instrumental learning.

Author information

Qü Albert J, Tai Lung-Hao, Hall Christopher D, Tu Emilie M, Eckstein Maria K, Mishchanchuk Karyna, Lin Wan Chen, Chase Juliana B, MacAskill Andrew F, Collins Anne G E, Gershman Samuel J, Wilbrecht Linda

Affiliations

Department of Psychology, University of California, Berkeley, Berkeley, California, United States of America.

Center for Computational Biology, University of California, Berkeley, Berkeley, California, United States of America.

Publication information

PLoS Comput Biol. 2025 Jul 2;21(7):e1013226. doi: 10.1371/journal.pcbi.1013226. eCollection 2025 Jul.

Abstract

Dopamine release in the nucleus accumbens has been hypothesized to signal the difference between observed and predicted reward, known as reward prediction error, suggesting a biological implementation for reinforcement learning. Rigorous tests of this hypothesis require assumptions about how the brain maps sensory signals to reward predictions, yet this mapping is still poorly understood. In particular, the mapping is non-trivial when sensory signals provide ambiguous information about the hidden state of the environment. Previous work using classical conditioning tasks has suggested that reward predictions are generated conditional on probabilistic beliefs about the hidden state, such that dopamine implicitly reflects these beliefs. Here we test this hypothesis in the context of an instrumental task (a two-armed bandit), where the hidden state switches stochastically. We measured choice behavior and recorded dLight signals that reflect dopamine release in the nucleus accumbens core. Model comparison among a wide set of cognitive models based on the behavioral data favored models that used Bayesian updating of probabilistic beliefs. These same models also quantitatively matched mesolimbic dLight measurements better than non-Bayesian alternatives. We conclude that probabilistic belief computation contributes to instrumental task performance in mice and is reflected in mesolimbic dopamine signaling.
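
To make the idea of a belief-conditioned reward prediction concrete, the following is a minimal sketch of one Bayesian update step in a two-armed bandit whose hidden state (which arm is currently better) can switch between trials. It is not the authors' model: the function name `update_belief` and the parameter values `p_high`, `p_low`, and `p_switch` are illustrative assumptions, and the abstract does not report the task's actual reward or switch probabilities.

```python
def update_belief(belief, choice, reward, p_high=0.8, p_low=0.2, p_switch=0.05):
    """One trial of belief updating in a two-state hidden Markov bandit.

    belief   : P(arm 0 is currently the better arm) before the trial
    choice   : 0 or 1, the arm that was chosen
    reward   : 0 or 1, the observed outcome
    p_high, p_low, p_switch : assumed values, chosen only for illustration
    Returns (rpe, new_belief).
    """
    # Reward probability of the chosen arm under each hidden-state hypothesis.
    p_r_if_0_good = p_high if choice == 0 else p_low
    p_r_if_1_good = p_low if choice == 0 else p_high

    # Reward prediction is the belief-weighted average over hidden states;
    # the prediction error is observed minus predicted reward.
    predicted = belief * p_r_if_0_good + (1 - belief) * p_r_if_1_good
    rpe = reward - predicted

    # Bayes' rule: weight each hypothesis by the likelihood of the outcome.
    like_0 = p_r_if_0_good if reward else 1 - p_r_if_0_good
    like_1 = p_r_if_1_good if reward else 1 - p_r_if_1_good
    posterior = belief * like_0 / (belief * like_0 + (1 - belief) * like_1)

    # Allow for a stochastic switch of the hidden state before the next trial.
    new_belief = posterior * (1 - p_switch) + (1 - posterior) * p_switch
    return rpe, new_belief


# Two consecutive rewarded choices of arm 0, starting from an uninformed belief.
rpe1, b1 = update_belief(0.5, choice=0, reward=1)
rpe2, b2 = update_belief(b1, choice=0, reward=1)
```

Under these assumed parameters, the second rewarded choice yields a smaller prediction error than the first, because the belief that arm 0 is the better arm has already increased. This is the sense in which a prediction-error signal can implicitly reflect beliefs about the hidden state.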

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a520/12233953/5208a03bf7c8/pcbi.1013226.g001.jpg
