

Computational mechanisms of distributed value representations and mixed learning strategies.

Affiliations

Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA.

Center for Computational Neuroscience, Flatiron Institute, Simons Foundation, New York, NY, USA.

Publication Information

Nat Commun. 2021 Dec 10;12(1):7191. doi: 10.1038/s41467-021-27413-2.

DOI:10.1038/s41467-021-27413-2
PMID:34893597
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8664930/
Abstract

Learning appropriate representations of the reward environment is challenging in the real world where there are many options, each with multiple attributes or features. Despite existence of alternative solutions for this challenge, neural mechanisms underlying emergence and adoption of value representations and learning strategies remain unknown. To address this, we measure learning and choice during a multi-dimensional probabilistic learning task in humans and trained recurrent neural networks (RNNs) to capture our experimental observations. We find that human participants estimate stimulus-outcome associations by learning and combining estimates of reward probabilities associated with the informative feature followed by those of informative conjunctions. Through analyzing representations, connectivity, and lesioning of the RNNs, we demonstrate this mixed learning strategy relies on a distributed neural code and opponency between excitatory and inhibitory neurons through value-dependent disinhibition. Together, our results suggest computational and neural mechanisms underlying emergence of complex learning strategies in naturalistic settings.
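The mixed learning strategy described above — learning reward probabilities for the informative feature first, then combining them with estimates for informative conjunctions — can be sketched as a simple delta-rule model. This is an illustrative sketch only, not the authors' fitted model; the learning rates `alpha_f`, `alpha_c` and the mixing weight `w` are assumed values.

```python
import numpy as np

# Hypothetical mixed feature + conjunction learner (sketch, assumed parameters):
# reward-probability estimates are learned separately for the informative
# feature and the informative conjunction, then combined into one value.

V_feat = np.full(3, 0.5)        # estimate per value of the informative feature
V_conj = np.full((3, 3), 0.5)   # estimate per conjunction of the other two dims

alpha_f, alpha_c = 0.10, 0.05   # learning rates (assumed)
w = 0.6                         # mixing weight on the feature estimate (assumed)

def value(stim):
    """Combined value of a stimulus given as (feature, other1, other2) indices."""
    f, a, b = stim
    return w * V_feat[f] + (1 - w) * V_conj[a, b]

def update(stim, reward):
    """Delta-rule update of both estimators toward the observed outcome."""
    f, a, b = stim
    V_feat[f] += alpha_f * (reward - V_feat[f])
    V_conj[a, b] += alpha_c * (reward - V_conj[a, b])

stim = (1, 2, 0)
v_before = value(stim)   # 0.5 with uniform initial estimates
update(stim, reward=1.0)
v_after = value(stim)    # 0.6*0.55 + 0.4*0.525 = 0.54
```

After one rewarded trial the combined estimate moves toward the outcome, with the faster feature-level estimator contributing most of the change — the qualitative pattern (feature learning leading conjunction learning) that the paper reports in participants.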

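The RNN analyses rely on networks with separate excitatory and inhibitory populations. A minimal rate RNN obeying Dale's principle — each unit's outgoing weights share one sign — can be sketched as below; the sizes, weight scale, and dynamics are assumptions for illustration, not the paper's trained model.

```python
import numpy as np

# Minimal excitatory-inhibitory rate RNN under Dale's principle (sketch,
# assumed sizes and scales -- not the trained networks from the paper).

rng = np.random.default_rng(0)
n_exc, n_inh = 40, 10
n = n_exc + n_inh

# Nonnegative magnitudes signed by presynaptic cell type: columns for
# excitatory units stay >= 0, columns for inhibitory units stay <= 0.
sign = np.concatenate([np.ones(n_exc), -np.ones(n_inh)])
W = rng.uniform(0.0, 0.04, size=(n, n)) * sign[None, :]

def step(r, x, dt=0.1, tau=1.0):
    """One Euler step of tau * dr/dt = -r + relu(W @ r + x)."""
    return r + (dt / tau) * (-r + np.maximum(W @ r + x, 0.0))

r = np.zeros(n)
x = np.zeros(n)
x[:n_exc] = 0.5                  # external drive onto excitatory units only
for _ in range(200):
    r = step(r, x)

# Dale's principle holds by construction, and rates remain nonnegative.
assert (W[:, :n_exc] >= 0).all() and (W[:, n_exc:] <= 0).all()
assert (r >= 0).all() and np.isfinite(r).all()
```

Constraining signs by column (presynaptic unit) is what makes the excitatory/inhibitory opponency analyzable: lesioning a column removes one cell's purely excitatory or purely inhibitory influence, the kind of manipulation the paper uses to expose value-dependent disinhibition.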

Figures 1–8 (PMC full-text images):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d45/8664930/2c63406c91f8/41467_2021_27413_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d45/8664930/70f5da949571/41467_2021_27413_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d45/8664930/007ae43af648/41467_2021_27413_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d45/8664930/25574567895e/41467_2021_27413_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d45/8664930/fd09a6b37c4f/41467_2021_27413_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d45/8664930/727ca46dd4fc/41467_2021_27413_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d45/8664930/34cf4b78d682/41467_2021_27413_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5d45/8664930/8c108a5ae065/41467_2021_27413_Fig8_HTML.jpg

Similar Articles

1. Computational mechanisms of distributed value representations and mixed learning strategies.
   Nat Commun. 2021 Dec 10;12(1):7191. doi: 10.1038/s41467-021-27413-2.
2. Performance of a Computational Model of the Mammalian Olfactory System.
3. Influence of learning strategy on response time during complex value-based learning and choice.
   PLoS One. 2018 May 22;13(5):e0197263. doi: 10.1371/journal.pone.0197263. eCollection 2018.
4. Stimulus-Driven and Spontaneous Dynamics in Excitatory-Inhibitory Recurrent Neural Networks for Sequence Representation.
   Neural Comput. 2021 Sep 16;33(10):2603-2645. doi: 10.1162/neco_a_01418.
5. Emergence of belief-like representations through reinforcement learning.
   bioRxiv. 2023 Apr 4:2023.04.04.535512. doi: 10.1101/2023.04.04.535512.
6. Previously Reward-Associated Stimuli Capture Spatial Attention in the Absence of Changes in the Corresponding Sensory Representations as Measured with MEG.
   J Neurosci. 2020 Jun 24;40(26):5033-5050. doi: 10.1523/JNEUROSCI.1172-19.2020. Epub 2020 May 4.
7. PsychRNN: An Accessible and Flexible Python Package for Training Recurrent Neural Network Models on Cognitive Tasks.
   eNeuro. 2021 Jan 15;8(1). doi: 10.1523/ENEURO.0427-20.2020. Print 2021 Jan-Feb.
8. Perceptual Salience and Reward Both Influence Feedback-Related Neural Activity Arising from Choice.
   J Neurosci. 2015 Sep 23;35(38):13064-75. doi: 10.1523/JNEUROSCI.1601-15.2015.
9. Training Excitatory-Inhibitory Recurrent Neural Networks for Cognitive Tasks: A Simple and Flexible Framework.
   PLoS Comput Biol. 2016 Feb 29;12(2):e1004792. doi: 10.1371/journal.pcbi.1004792. eCollection 2016 Feb.
10. Task representations in neural networks trained to perform many cognitive tasks.
   Nat Neurosci. 2019 Feb;22(2):297-306. doi: 10.1038/s41593-018-0310-2. Epub 2019 Jan 14.

Cited By

1. Neuronal Decoding of Decisions in Multidimensional Feature Space Using a Gated Recurrent Variational Autoencoder.
   bioRxiv. 2025 Aug 25:2025.08.20.671126. doi: 10.1101/2025.08.20.671126.
2. Contributions of Attention to Learning in Multidimensional Reward Environments.
   J Neurosci. 2025 Feb 12;45(7):e2300232024. doi: 10.1523/JNEUROSCI.2300-23.2024.
3. Visual perceptual learning of feature conjunctions leverages non-linear mixed selectivity.
   NPJ Sci Learn. 2024 Mar 1;9(1):13. doi: 10.1038/s41539-024-00226-w.

References

1. Computational models of adaptive behavior and prefrontal cortex.
   Neuropsychopharmacology. 2022 Jan;47(1):58-71. doi: 10.1038/s41386-021-01123-1. Epub 2021 Aug 13.
2. Timescales of Cognition in the Brain.
   Curr Opin Behav Sci. 2021 Oct;41:30-37. doi: 10.1016/j.cobeha.2021.03.003. Epub 2021 Mar 31.
3. Meta-Learning in Neural Networks: A Survey.
   IEEE Trans Pattern Anal Mach Intell. 2022 Sep;44(9):5149-5169. doi: 10.1109/TPAMI.2021.3079209. Epub 2022 Aug 4.
4. Learning arbitrary stimulus-reward associations for naturalistic stimuli involves transition from learning about features to learning about objects.
   Cognition. 2020 Dec;205:104425. doi: 10.1016/j.cognition.2020.104425. Epub 2020 Sep 19.
5. Multiple timescales of neural dynamics and integration of task-relevant signals across cortex.
   Proc Natl Acad Sci U S A. 2020 Sep 8;117(36):22522-22531. doi: 10.1073/pnas.2005993117. Epub 2020 Aug 24.
6. Backpropagation and the brain.
   Nat Rev Neurosci. 2020 Jun;21(6):335-346. doi: 10.1038/s41583-020-0277-3. Epub 2020 Apr 17.
7. Combinations of low-level and high-level neural processes account for distinct patterns of context-dependent choice.
   PLoS Comput Biol. 2019 Oct 14;15(10):e1007427. doi: 10.1371/journal.pcbi.1007427. eCollection 2019 Oct.
8. Models that learn how humans learn: The case of decision-making and its disorders.
   PLoS Comput Biol. 2019 Jun 11;15(6):e1006903. doi: 10.1371/journal.pcbi.1006903. eCollection 2019 Jun.
9. Circuit mechanisms for the maintenance and manipulation of information in working memory.
   Nat Neurosci. 2019 Jul;22(7):1159-1167. doi: 10.1038/s41593-019-0414-3. Epub 2019 Jun 10.
10. Holistic Reinforcement Learning: The Role of Structure and Attention.
   Trends Cogn Sci. 2019 Apr;23(4):278-292. doi: 10.1016/j.tics.2019.01.010. Epub 2019 Feb 26.