Model-free and model-based reward prediction errors in EEG.

Affiliations

School of Psychology, University of East Anglia, United Kingdom.

Cognition Institute, School of Psychology, University of Plymouth, United Kingdom.

Publication information

Neuroimage. 2018 Sep;178:162-171. doi: 10.1016/j.neuroimage.2018.05.023. Epub 2018 May 24.

DOI: 10.1016/j.neuroimage.2018.05.023
PMID: 29758337

Abstract

Learning theorists posit two reinforcement learning systems: model-free and model-based. Model-based learning incorporates knowledge about structure and contingencies in the world to assign candidate actions with an expected value. Model-free learning is ignorant of the world's structure; instead, actions hold a value based on prior reinforcement, with this value updated by expectancy violation in the form of a reward prediction error. Because they use such different learning mechanisms, it has been previously assumed that model-based and model-free learning are computationally dissociated in the brain. However, recent fMRI evidence suggests that the brain may compute reward prediction errors to both model-free and model-based estimates of value, signalling the possibility that these systems interact. Because of its poor temporal resolution, fMRI risks confounding reward prediction errors with other feedback-related neural activity. In the present study, EEG was used to show the presence of both model-based and model-free reward prediction errors and their place in a temporal sequence of events including state prediction errors and action value updates. This demonstration of model-based prediction errors questions a long-held assumption that model-free and model-based learning are dissociated in the brain.

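The abstract states no equations, so the following is only a minimal sketch of the two kinds of reward prediction error it contrasts, using textbook forms. The delta-rule update, the learning rate `alpha`, the transition probabilities, and all function and variable names are illustrative assumptions, not the authors' model or analysis.

```python
# Illustrative sketch only: textbook-style prediction errors, not the paper's model.

def model_free_update(q, action, reward, alpha=0.1):
    """Delta-rule update: the value of `action` moves toward the reward.

    Returns the reward prediction error (RPE) and a new value table.
    `alpha` is an assumed learning rate.
    """
    rpe = reward - q[action]              # expectancy violation
    new_q = dict(q)                       # leave the input table unchanged
    new_q[action] = q[action] + alpha * rpe
    return rpe, new_q


def model_based_value(transitions, state_values):
    """Expected value of an action from knowledge of the task structure.

    `transitions` maps each successor state to its (assumed) probability;
    `state_values` maps each successor state to its value.
    """
    return sum(p * state_values[s] for s, p in transitions.items())


def model_based_rpe(reward, transitions, state_values):
    """RPE measured against the model-based value estimate."""
    return reward - model_based_value(transitions, state_values)


# Model-free: value depends only on past reinforcement of the action.
q = {"left": 0.5, "right": 0.5}
rpe_mf, q_new = model_free_update(q, "left", reward=1.0, alpha=0.1)

# Model-based: value is derived from hypothetical transition knowledge.
transitions = {"A": 0.7, "B": 0.3}
state_values = {"A": 1.0, "B": 0.0}
rpe_mb = model_based_rpe(1.0, transitions, state_values)

print(rpe_mf, q_new["left"], rpe_mb)
```

In this toy case the same reward produces different prediction errors (about 0.5 model-free, about 0.3 model-based) because each is measured against a different value estimate. That distinction is what the study looks for in the EEG signal.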

Similar articles

1
Model-free and model-based reward prediction errors in EEG.
Neuroimage. 2018 Sep;178:162-171. doi: 10.1016/j.neuroimage.2018.05.023. Epub 2018 May 24.
2
Reward expectation and prediction error in human medial frontal cortex: an EEG study.
Neuroimage. 2014 Jan 1;84:376-82. doi: 10.1016/j.neuroimage.2013.08.058. Epub 2013 Sep 2.
3
Oscillatory signatures of reward prediction errors in declarative learning.
Neuroimage. 2019 Feb 1;186:137-145. doi: 10.1016/j.neuroimage.2018.10.083. Epub 2018 Nov 2.
4
When the outcome is different than expected: Subjective expectancy shapes reward prediction error at the FRN level.
Psychophysiology. 2019 Dec;56(12):e13456. doi: 10.1111/psyp.13456. Epub 2019 Aug 12.
5
How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.
J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.
6
Perceptual Salience and Reward Both Influence Feedback-Related Neural Activity Arising from Choice.
J Neurosci. 2015 Sep 23;35(38):13064-75. doi: 10.1523/JNEUROSCI.1601-15.2015.
7
Dissociable effects of reward and expectancy during evaluative feedback processing revealed by topographic ERP mapping analysis.
Int J Psychophysiol. 2018 Oct;132(Pt B):213-225. doi: 10.1016/j.ijpsycho.2017.11.013. Epub 2017 Nov 24.
8
Cortical delta activity reflects reward prediction error and related behavioral adjustments, but at different times.
Neuroimage. 2015 Apr 15;110:205-16. doi: 10.1016/j.neuroimage.2015.02.007. Epub 2015 Feb 10.
9
State anxiety alters the neural oscillatory correlates of predictions and prediction errors during reward-based learning.
Neuroimage. 2022 Apr 1;249:118895. doi: 10.1016/j.neuroimage.2022.118895. Epub 2022 Jan 10.
10
Dissociating contributions of ACC and vmPFC in reward prediction, outcome, and choice.
Neuropsychologia. 2014 Jul;59:112-23. doi: 10.1016/j.neuropsychologia.2014.04.019. Epub 2014 May 9.

Cited by

1
Global neural encoding of behavioral strategies in mice during perceptual decision-making task with two different sensory patterns.
iScience. 2024 Oct 16;27(11):111182. doi: 10.1016/j.isci.2024.111182. eCollection 2024 Nov 15.
2
Surprise-minimization as a solution to the structural credit assignment problem.
PLoS Comput Biol. 2024 May 28;20(5):e1012175. doi: 10.1371/journal.pcbi.1012175. eCollection 2024 May.
3
Neurocognitive reward processes measured via event-related potentials are associated with binge-eating disorder diagnosis and ecologically-assessed behavior.
Appetite. 2024 Feb 1;193:107151. doi: 10.1016/j.appet.2023.107151. Epub 2023 Dec 6.
4
The potential application of event-related potentials to enhance research on reward processes in eating disorders.
Int J Eat Disord. 2022 Nov;55(11):1484-1495. doi: 10.1002/eat.23821. Epub 2022 Oct 10.
5
Behavioral construction of the future.
Psychol Addict Behav. 2023 Feb;37(1):13-24. doi: 10.1037/adb0000853. Epub 2022 Jun 27.
6
Model-based learning retrospectively updates model-free values.
Sci Rep. 2022 Feb 11;12(1):2358. doi: 10.1038/s41598-022-05567-3.
7
Reward prediction error in the ERP following unconditioned aversive stimuli.
Sci Rep. 2021 Oct 7;11(1):19912. doi: 10.1038/s41598-021-99408-4.
8
Model-Based Planning Deficits in Compulsivity Are Linked to Faulty Neural Representations of Task Structure.
J Neurosci. 2021 Jul 28;41(30):6539-6550. doi: 10.1523/JNEUROSCI.0031-21.2021. Epub 2021 Jun 15.
9
Willpower with and without effort.
Behav Brain Sci. 2020 Aug 26;44:e30. doi: 10.1017/S0140525X20000357.
10
The influence of internal models on feedback-related brain activity.
Cogn Affect Behav Neurosci. 2020 Oct;20(5):1070-1089. doi: 10.3758/s13415-020-00820-6.