
Efficient coding of cognitive variables underlies dopamine response and choice behavior.

Author Affiliations

Champalimaud Neuroscience Programme, Champalimaud Foundation, Lisbon, Portugal.

Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA.

Publication Information

Nat Neurosci. 2022 Jun;25(6):738-748. doi: 10.1038/s41593-022-01085-7. Epub 2022 Jun 6.

DOI: 10.1038/s41593-022-01085-7
PMID: 35668173
Abstract

Reward expectations based on internal knowledge of the external environment are a core component of adaptive behavior. However, internal knowledge may be inaccurate or incomplete due to errors in sensory measurements. Some features of the environment may also be encoded inaccurately to minimize representational costs associated with their processing. In this study, we investigated how reward expectations are affected by features of internal representations by studying behavior and dopaminergic activity while mice make time-based decisions. We show that several possible representations allow a reinforcement learning agent to model animals' overall performance during the task. However, only a small subset of highly compressed representations simultaneously reproduced the co-variability in animals' choice behavior and dopaminergic activity. Strikingly, these representations predict an unusual distribution of response times that closely match animals' behavior. These results inform how constraints of representational efficiency may be expressed in encoding representations of dynamic cognitive variables used for reward-based computations.

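The abstract describes reinforcement learning agents whose state is a compressed internal representation of elapsed time, with TD errors standing in for dopaminergic activity. The following is a minimal illustrative sketch of that general idea, not the authors' actual model: a tabular TD(0) learner over a coarsely discretized time representation, where the granularity of `compress_time` (a hypothetical helper) controls how compressed the representation is.

```python
import numpy as np

def compress_time(t, n_states, t_max):
    """Map continuous elapsed time onto a few discrete states
    (a crude stand-in for a compressed internal representation)."""
    return min(int(n_states * t / t_max), n_states - 1)

def run_td(n_states=4, t_max=3.0, reward_time=2.4, episodes=500,
           alpha=0.1, gamma=0.95, dt=0.1):
    """Tabular TD(0) value learning on a fixed-duration trial with a
    single reward at reward_time. The TD error `delta` is the quantity
    that, in models of this kind, is compared to dopamine responses."""
    values = np.zeros(n_states)
    for _ in range(episodes):
        t = 0.0
        while t < t_max:
            s = compress_time(t, n_states, t_max)
            t_next = t + dt
            # Reward delivered during the step that crosses reward_time.
            r = 1.0 if t < reward_time <= t_next else 0.0
            if t_next >= t_max:
                delta = r - values[s]  # terminal TD error
            else:
                s_next = compress_time(t_next, n_states, t_max)
                delta = r + gamma * values[s_next] - values[s]
            values[s] += alpha * delta
            t = t_next
    return values

if __name__ == "__main__":
    v = run_td()
    print(v)  # learned values ramp toward the state containing the reward
```

Varying `n_states` changes how finely time is encoded, so the same learning rule yields different value profiles and TD-error patterns under different representations, which is the comparison at the heart of the study.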

Similar Articles

1. Efficient coding of cognitive variables underlies dopamine response and choice behavior.
   Nat Neurosci. 2022 Jun;25(6):738-748. doi: 10.1038/s41593-022-01085-7. Epub 2022 Jun 6.
2. Reward-dependent learning in neuronal networks for planning and decision making.
   Prog Brain Res. 2000;126:217-29. doi: 10.1016/S0079-6123(00)26016-0.
3. A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task.
   Neuroscience. 1999;91(3):871-90. doi: 10.1016/s0306-4522(98)00697-6.
4. A reinforcement learning mechanism responsible for the valuation of free choice.
   Neuron. 2014 Aug 6;83(3):551-7. doi: 10.1016/j.neuron.2014.06.035. Epub 2014 Jul 24.
5. Genetic variation in dopaminergic neuromodulation influences the ability to rapidly and flexibly adapt decisions.
   Proc Natl Acad Sci U S A. 2009 Oct 20;106(42):17951-6. doi: 10.1073/pnas.0905191106. Epub 2009 Oct 12.
6. Dopamine-mediated reinforcement learning signals in the striatum and ventromedial prefrontal cortex underlie value-based choices.
   J Neurosci. 2011 Feb 2;31(5):1606-13. doi: 10.1523/JNEUROSCI.3904-10.2011.
7. Dopaminergic Modulation of Human Intertemporal Choice: A Diffusion Model Analysis Using the D2-Receptor Antagonist Haloperidol.
   J Neurosci. 2020 Oct 7;40(41):7936-7948. doi: 10.1523/JNEUROSCI.0592-20.2020. Epub 2020 Sep 18.
8. Striatal dopamine ramping may indicate flexible reinforcement learning with forgetting in the cortico-basal ganglia circuits.
   Front Neural Circuits. 2014 Apr 9;8:36. doi: 10.3389/fncir.2014.00036. eCollection 2014.
9. Towards a Unifying Cognitive, Neurophysiological, and Computational Neuroscience Account of Schizophrenia.
   Schizophr Bull. 2019 Sep 11;45(5):1092-1100. doi: 10.1093/schbul/sby154.
10. Multiplexing signals in reinforcement learning with internal models and dopamine.
    Curr Opin Neurobiol. 2014 Apr;25:123-9. doi: 10.1016/j.conb.2014.01.001. Epub 2014 Jan 23.

Cited By

1. Integrated dopamine sensing and 40 Hz hippocampal stimulation improves cognitive performance in Alzheimer's mouse models.
   Nat Commun. 2025 Jul 1;16(1):5948. doi: 10.1038/s41467-025-60903-1.
2. Neural signatures of temporal anticipation in human cortex represent event probability density.
   Nat Commun. 2025 Mar 16;16(1):2602. doi: 10.1038/s41467-025-57813-7.
3. Reward prediction error neurons implement an efficient code for reward.
   Nat Neurosci. 2024 Jul;27(7):1333-1339. doi: 10.1038/s41593-024-01671-x. Epub 2024 Jun 19.
4. Frontostriatal circuit dysfunction leads to cognitive inflexibility in neuroligin-3 R451C knockin mice.
   Mol Psychiatry. 2024 Aug;29(8):2308-2320. doi: 10.1038/s41380-024-02505-9. Epub 2024 Mar 8.

References

1. Optimal anticipatory control as a theory of motor preparation: A thalamo-cortical circuit model.
   Neuron. 2021 May 5;109(9):1567-1581.e12. doi: 10.1016/j.neuron.2021.03.009. Epub 2021 Mar 30.
2. Linking Connectivity, Dynamics, and Computations in Low-Rank Recurrent Neural Networks.
   Neuron. 2018 Aug 8;99(3):609-623.e29. doi: 10.1016/j.neuron.2018.07.003. Epub 2018 Jul 26.
3. Flexible Sensorimotor Computations through Rapid Reconfiguration of Cortical Dynamics.
   Neuron. 2018 Jun 6;98(5):1005-1019.e5. doi: 10.1016/j.neuron.2018.05.020.
4. Prefrontal cortex as a meta-reinforcement learning system.
   Nat Neurosci. 2018 Jun;21(6):860-868. doi: 10.1038/s41593-018-0147-8. Epub 2018 May 14.
5. Flexible timing by temporal scaling of cortical responses.
   Nat Neurosci. 2018 Jan;21(1):102-110. doi: 10.1038/s41593-017-0028-6. Epub 2017 Dec 4.
6. Predictive representations can link model-based reinforcement learning to model-free mechanisms.
   PLoS Comput Biol. 2017 Sep 25;13(9):e1005768. doi: 10.1371/journal.pcbi.1005768. eCollection 2017 Sep.
7. Neural Circuitry of Reward Prediction Error.
   Annu Rev Neurosci. 2017 Jul 25;40:373-394. doi: 10.1146/annurev-neuro-072116-031109. Epub 2017 Apr 24.
8. Midbrain Dopamine Neurons Signal Belief in Choice Accuracy during a Perceptual Decision.
   Curr Biol. 2017 Mar 20;27(6):821-832. doi: 10.1016/j.cub.2017.02.026. Epub 2017 Mar 9.
9. Dopamine reward prediction errors reflect hidden-state inference across time.
   Nat Neurosci. 2017 Apr;20(4):581-589. doi: 10.1038/nn.4520. Epub 2017 Mar 6.
10. Reward-based training of recurrent neural networks for cognitive and value-based tasks.
    eLife. 2017 Jan 13;6:e21492. doi: 10.7554/eLife.21492.