fMRI and EEG predictors of dynamic decision parameters during human reinforcement learning.

Author information

Frank Michael J, Gagne Chris, Nyhus Erika, Masters Sean, Wiecki Thomas V, Cavanagh James F, Badre David

Affiliations

Department of Cognitive, Linguistic and Psychological Sciences, Brown University, Providence, Rhode Island 02912; Brown Institute for Brain Science, Providence, Rhode Island 02912; and Department of Psychiatry and Human Behavior, Brown University, Providence, Rhode Island 02912.

Department of Cognitive, Linguistic and Psychological Sciences, Brown University, Providence, Rhode Island 02912.

Publication information

J Neurosci. 2015 Jan 14;35(2):485-94. doi: 10.1523/JNEUROSCI.2036-14.2015.

DOI:10.1523/JNEUROSCI.2036-14.2015

PMID:25589744

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4293405/

Abstract

What are the neural dynamics of choice processes during reinforcement learning? Two largely separate literatures have examined dynamics of reinforcement learning (RL) as a function of experience but assuming a static choice process, or conversely, the dynamics of choice processes in decision making but based on static decision values. Here we show that human choice processes during RL are well described by a drift diffusion model (DDM) of decision making in which the learned trial-by-trial reward values are sequentially sampled, with a choice made when the value signal crosses a decision threshold. Moreover, simultaneous fMRI and EEG recordings revealed that this decision threshold is not fixed across trials but varies as a function of activity in the subthalamic nucleus (STN) and is further modulated by trial-by-trial measures of decision conflict and activity in the dorsomedial frontal cortex (pre-SMA BOLD and mediofrontal theta in EEG). These findings provide converging multimodal evidence for a model in which decision threshold in reward-based tasks is adjusted as a function of communication from pre-SMA to STN when choices differ subtly in reward values, allowing more time to choose the statistically more rewarding option.
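
The mechanism the abstract describes (learned values drive a diffusion process, and the decision threshold rises when options differ only subtly in value) can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' fitted model: the delta-rule learner, the function names, and every parameter value (`learning_rate`, `drift_scale`, `base_threshold`, `conflict_gain`) are hypothetical, and a simple value-similarity term stands in for the pre-SMA, mediofrontal theta, and STN signals that the study used as trial-by-trial predictors.

```python
import random


def simulate_ddm_trial(drift, threshold, rng, dt=0.001, noise_sd=1.0, max_time=5.0):
    """Simulate one drift diffusion trial with Euler steps.

    Evidence starts midway between the bounds 0 and `threshold`.
    Returns (choice, decision_time): choice 0 means the upper bound was hit,
    choice 1 the lower bound, and None that no bound was reached in max_time.
    """
    x = threshold / 2.0
    t = 0.0
    step_sd = noise_sd * dt ** 0.5
    while t < max_time:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
        if x >= threshold:
            return 0, t
        if x <= 0.0:
            return 1, t
    return None, t


def run_session(n_trials=200, reward_probs=(0.8, 0.2), learning_rate=0.1,
                drift_scale=3.0, base_threshold=1.5, conflict_gain=0.5, seed=0):
    """Two-option learning task whose choices come from the DDM above.

    All parameter values are illustrative assumptions, not estimates
    from the study.
    """
    rng = random.Random(seed)
    q = [0.5, 0.5]  # learned reward values, updated trial by trial
    trials = []
    for _ in range(n_trials):
        value_diff = q[0] - q[1]
        # Assumed conflict measure: largest when the two values are similar.
        conflict = 1.0 - abs(value_diff)
        # Assumed linear rule: higher conflict raises the decision threshold.
        threshold = base_threshold + conflict_gain * conflict
        choice, rt = simulate_ddm_trial(drift_scale * value_diff, threshold, rng)
        if choice is None:
            continue  # no decision within max_time; skip the trial
        reward = 1.0 if rng.random() < reward_probs[choice] else 0.0
        # Delta-rule update of the chosen option's value.
        q[choice] += learning_rate * (reward - q[choice])
        trials.append({"choice": choice, "rt": rt,
                       "threshold": threshold, "reward": reward})
    return q, trials


if __name__ == "__main__":
    final_q, trials = run_session()
    print("final values:", final_q)
    print("completed trials:", len(trials))
```

In this sketch the threshold is highest early in learning, when the two values are still close, and falls as the values separate, so the simulated decision maker takes more time exactly when the options are hardest to tell apart.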

Similar articles

fMRI and EEG predictors of dynamic decision parameters during human reinforcement learning.

J Neurosci. 2015 Jan 14;35(2):485-94. doi: 10.1523/JNEUROSCI.2036-14.2015.

Subthalamic nucleus stimulation reverses mediofrontal influence over decision threshold.

Nat Neurosci. 2011 Sep 25;14(11):1462-7. doi: 10.1038/nn.2925.

Cross-Task Contributions of Frontobasal Ganglia Circuitry in Response Inhibition and Conflict-Induced Slowing.

Cereb Cortex. 2019 May 1;29(5):1969-1983. doi: 10.1093/cercor/bhy076.

Midline frontal cortex low-frequency activity drives subthalamic nucleus oscillations during conflict.

J Neurosci. 2014 May 21;34(21):7322-33. doi: 10.1523/JNEUROSCI.1169-14.2014.

Deep Brain Stimulation of the Subthalamic Nucleus Does Not Affect the Decrease of Decision Threshold during the Choice Process When There Is No Conflict, Time Pressure, or Reward.

J Cogn Neurosci. 2018 Jun;30(6):876-884. doi: 10.1162/jocn_a_01252. Epub 2018 Feb 28.

Frontal theta links prediction errors to behavioral adaptation in reinforcement learning.

Neuroimage. 2010 Feb 15;49(4):3198-209. doi: 10.1016/j.neuroimage.2009.11.080. Epub 2009 Dec 5.

Frontal theta overrides pavlovian learning biases.

J Neurosci. 2013 May 8;33(19):8541-8. doi: 10.1523/JNEUROSCI.5754-12.2013.

Decomposing the effects of context valence and feedback information on speed and accuracy during reinforcement learning: a meta-analytical approach using diffusion decision modeling.

Cogn Affect Behav Neurosci. 2019 Jun;19(3):490-502. doi: 10.3758/s13415-019-00723-1.

Reinforcement-based decision making in corticostriatal circuits: mutual constraints by neurocomputational and diffusion models.

Neural Comput. 2012 May;24(5):1186-229. doi: 10.1162/NECO_a_00270. Epub 2012 Feb 1.

The subthalamic nucleus during decision-making with multiple alternatives.

Hum Brain Mapp. 2015 Oct;36(10):4041-4052. doi: 10.1002/hbm.22896. Epub 2015 Jul 15.

Cited by

Prior probability biases perceptual choices by modulating the accumulation rate, rather than the baseline, of decision evidence.

Imaging Neurosci (Camb). 2024 Nov 18;2. doi: 10.1162/imag_a_00338. eCollection 2024.

Attentional dysfunction arises from right frontocentral and occipital network connectivity in Parkinson's disease.

Neuroimage Rep. 2025 Feb 10;5(1):100241. doi: 10.1016/j.ynirp.2025.100241. eCollection 2025 Mar.

Response-locked theta dissociations reveal potential feedback signal following successful retrieval.

Imaging Neurosci (Camb). 2024 Jun 27;2:1-16. doi: 10.1162/imag_a_00207. eCollection 2024 Jun 1.

Cross-modal congruency modulates evidence accumulation, not decision thresholds.

Front Neurosci. 2025 Feb 20;19:1513083. doi: 10.3389/fnins.2025.1513083. eCollection 2025.

Toward a Mechanistic Understanding of Reading Difficulties: Deviant Audiovisual Learning Dynamics and Network Connectivity in Children with Poor Reading Skills.

J Neurosci. 2025 Apr 23;45(17):e1119242025. doi: 10.1523/JNEUROSCI.1119-24.2025.

Sensation seeking and risk adjustment: the role of reward sensitivity in dynamic risky decisions.

Front Behav Neurosci. 2025 Feb 7;19:1492312. doi: 10.3389/fnbeh.2025.1492312. eCollection 2025.

Theta and beta power in the subthalamic nucleus responds to conflict across subregions and hemispheres.

Brain Commun. 2025 Jan 16;7(1):fcaf021. doi: 10.1093/braincomms/fcaf021. eCollection 2025.

Basal ganglia components have distinct computational roles in decision-making dynamics under conflict and uncertainty.

PLoS Biol. 2025 Jan 23;23(1):e3002978. doi: 10.1371/journal.pbio.3002978. eCollection 2025 Jan.

Broadscale dampening of uncertainty adjustment in the aging brain.

Nat Commun. 2024 Dec 23;15(1):10717. doi: 10.1038/s41467-024-55416-2.

Subthalamic stimulation causally modulates human voluntary decision-making to stay or go.

NPJ Parkinsons Dis. 2024 Nov 2;10(1):210. doi: 10.1038/s41531-024-00807-x.
