Nucleus accumbens dopamine release reflects Bayesian inference during instrumental learning.

Author information

Qü Albert J, Tai Lung-Hao, Hall Christopher D, Tu Emilie M, Eckstein Maria K, Mishchanchuk Karyna, Lin Wan Chen, Chase Juliana B, MacAskill Andrew F, Collins Anne G E, Gershman Samuel J, Wilbrecht Linda

Affiliations

Department of Psychology, University of California, Berkeley, CA, 94720, USA.

Center for Computational Biology, University of California, Berkeley, CA, 94720, USA.

Publication information

bioRxiv. 2024 Sep 13:2023.11.10.566306. doi: 10.1101/2023.11.10.566306.

Abstract

Dopamine release in the nucleus accumbens has been hypothesized to signal reward prediction error, the difference between observed and predicted reward, suggesting a biological implementation for reinforcement learning. Rigorous tests of this hypothesis require assumptions about how the brain maps sensory signals to reward predictions, yet this mapping is still poorly understood. In particular, the mapping is non-trivial when sensory signals provide ambiguous information about the hidden state of the environment. Previous work using classical conditioning tasks has suggested that reward predictions are generated conditional on probabilistic beliefs about the hidden state, such that dopamine implicitly reflects these beliefs. Here we test this hypothesis in the context of an instrumental task (a two-armed bandit), where the hidden state switches repeatedly. We measured choice behavior and recorded dLight signals reflecting dopamine release in the nucleus accumbens core. Model comparison among a wide set of cognitive models based on the behavioral data favored models that used Bayesian updating of probabilistic beliefs. These same models also quantitatively matched the dopamine measurements better than non-Bayesian alternatives. We conclude that probabilistic belief computation contributes to instrumental task performance in mice and is reflected in mesolimbic dopamine signaling.
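To make the idea of reward predictions "conditional on probabilistic beliefs about the hidden state" concrete, here is a minimal sketch of a belief-based reward prediction error in a two-armed bandit. It is an illustration only, not the authors' model: the reward probabilities (`p_high`, `p_low`), the per-trial switch probability (`hazard`), and all function names are assumptions chosen for the example.

```python
# Hypothetical illustration: Bayesian belief update and belief-based RPE
# in a two-armed bandit whose hidden state (which arm is better) can switch.
# All parameter values below are assumed, not taken from the paper.

def expected_reward(belief, choice, p_high=0.8, p_low=0.2):
    """Reward prediction given belief = P(arm 0 is the high-reward arm)."""
    p_choice_is_good = belief if choice == 0 else 1.0 - belief
    return p_choice_is_good * p_high + (1.0 - p_choice_is_good) * p_low


def reward_prediction_error(belief, choice, reward, p_high=0.8, p_low=0.2):
    """Observed reward minus the belief-weighted predicted reward."""
    return reward - expected_reward(belief, choice, p_high, p_low)


def update_belief(belief, choice, reward, p_high=0.8, p_low=0.2, hazard=0.05):
    """One trial of Bayesian updating, then a possible hidden-state switch."""

    def likelihood(good_arm):
        # Probability of the observed outcome if `good_arm` is the better arm.
        p_reward = p_high if choice == good_arm else p_low
        return p_reward if reward else 1.0 - p_reward

    # Bayes' rule over the two hidden states.
    joint0 = belief * likelihood(0)
    joint1 = (1.0 - belief) * likelihood(1)
    posterior = joint0 / (joint0 + joint1)

    # Account for the chance that the state switches before the next trial.
    return posterior * (1.0 - hazard) + (1.0 - posterior) * hazard


if __name__ == "__main__":
    belief = 0.5  # start maximally uncertain
    for choice, reward in [(0, 1), (0, 1), (0, 0), (1, 1)]:
        rpe = reward_prediction_error(belief, choice, reward)
        belief = update_belief(belief, choice, reward)
        print(f"choice={choice} reward={reward} rpe={rpe:+.3f} belief={belief:.3f}")
```

In this sketch the same reward produces a smaller prediction error once the belief has shifted toward the chosen arm, which is the sense in which a dopamine-like error signal could implicitly reflect beliefs about the hidden state.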

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfe2/11423117/0c6bf9651ee2/nihpp-2023.11.10.566306v2-f0001.jpg