Neural correlates of forward planning in a spatial decision task in humans.

Affiliation

Department of Psychology, Center for Neural Science, New York University, New York, New York 10003, USA.

Publication information

J Neurosci. 2011 Apr 6;31(14):5526-39. doi: 10.1523/JNEUROSCI.4647-10.2011.

DOI: 10.1523/JNEUROSCI.4647-10.2011
PMID: 21471389
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3108440/

Abstract

Although reinforcement learning (RL) theories have been influential in characterizing the mechanisms for reward-guided choice in the brain, the predominant temporal difference (TD) algorithm cannot explain many flexible or goal-directed actions that have been demonstrated behaviorally. We investigate such actions by contrasting an RL algorithm that is model based, in that it relies on learning a map or model of the task and planning within it, to traditional model-free TD learning. To distinguish these approaches in humans, we used functional magnetic resonance imaging in a continuous spatial navigation task, in which frequent changes to the layout of the maze forced subjects continually to relearn their favored routes, thereby exposing the RL mechanisms used. We sought evidence for the neural substrates of such mechanisms by comparing choice behavior and blood oxygen level-dependent (BOLD) signals to decision variables extracted from simulations of either algorithm. Both choices and value-related BOLD signals in striatum, although most often associated with TD learning, were better explained by the model-based theory. Furthermore, predecessor quantities for the model-based value computation were correlated with BOLD signals in the medial temporal lobe and frontal cortex. These results point to a significant extension of both the computational and anatomical substrates for RL in the brain.
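
The contrast the abstract draws between model-based planning and model-free TD learning can be illustrated with a toy example. The sketch below is not the authors' task, model, or code. The five-state maze, the parameter values, and every function name are assumptions made for illustration. Q-learning stands in for "TD learning" and value iteration stands in for "planning within a map". The planner is also simply handed the new map after the layout change, whereas the study's subjects had to learn it.

```python
import random

GAMMA = 0.9   # discount factor (assumed value)
GOAL = 3      # terminal state; entering it pays reward 1

def make_maze(shortcut_open=True):
    """Hypothetical deterministic maze: {state: {action: next_state}}.

    Start is state 0. "left" leads to the goal in 2 steps while the
    shortcut is open; "right" always reaches it in 3 steps.
    Closing the shortcut turns state 1 into a dead end.
    """
    return {
        0: {"left": 1, "right": 2},
        1: {"go": GOAL if shortcut_open else 1},
        2: {"go": 4},
        4: {"go": GOAL},
    }

def reward(next_state):
    return 1.0 if next_state == GOAL else 0.0

def model_based_q(maze, sweeps=50):
    """Model-based: plan by value iteration over a known map."""
    v = {s: 0.0 for s in list(maze) + [GOAL]}
    for _ in range(sweeps):
        for s, actions in maze.items():
            v[s] = max(reward(n) + GAMMA * v[n] for n in actions.values())
    return {s: {a: reward(n) + GAMMA * v[n] for a, n in acts.items()}
            for s, acts in maze.items()}

def td_learn(maze, q=None, episodes=200, alpha=0.5, epsilon=0.2,
             seed=0, max_steps=20):
    """Model-free: Q-learning, a TD method that caches action values
    from experienced transitions and never consults a map."""
    rng = random.Random(seed)
    if q is None:
        q = {s: {a: 0.0 for a in acts} for s, acts in maze.items()}
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if s == GOAL:
                break
            acts = list(maze[s])
            if rng.random() < epsilon:          # explore
                a = rng.choice(acts)
            else:                               # exploit cached values
                a = max(acts, key=lambda x: q[s][x])
            n = maze[s][a]
            future = 0.0 if n == GOAL else max(q[n].values())
            # TD update: move the cached value toward the one-step target
            q[s][a] += alpha * (reward(n) + GAMMA * future - q[s][a])
            s = n
    return q

def best_first_move(q):
    return max(q[0], key=q[0].get)

open_maze = make_maze(shortcut_open=True)
closed_maze = make_maze(shortcut_open=False)

# Phase 1: both learners experience the maze with the shortcut open.
q_td = td_learn(open_maze)
print("open maze   TD:", best_first_move(q_td),
      "| planner:", best_first_move(model_based_q(open_maze)))

# Phase 2: the layout changes. The planner re-plans on the new map at once;
# the TD learner still holds values cached under the old layout.
stale_choice = best_first_move(q_td)
mb_choice = best_first_move(model_based_q(closed_maze))
print("after change TD (no new experience):", stale_choice,
      "| planner:", mb_choice)

# Phase 3: the TD learner recovers only after re-experiencing the maze.
q_td = td_learn(closed_maze, q=q_td, episodes=100, seed=1)
relearned_choice = best_first_move(q_td)
print("after relearning TD:", relearned_choice)
```

Under these assumptions, the two learners agree while the layout is stable and diverge only immediately after it changes. That is the kind of behavioral signature that frequent layout changes are meant to expose.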
