Spatiotemporal neural characterization of prediction error valence and surprise during reward learning in humans.

Affiliations

Institute of Neuroscience & Psychology, University of Glasgow, Glasgow, UK.

Department of Experimental Psychology, University of Oxford, Oxford, UK.

Publication information

Sci Rep. 2017 Jul 6;7(1):4762. doi: 10.1038/s41598-017-04507-w.

Abstract

Reward learning depends on accurate reward associations with potential choices. These associations can be attained with reinforcement learning mechanisms using a reward prediction error (RPE) signal (the difference between actual and expected rewards) for updating future reward expectations. Despite an extensive body of literature on the influence of RPE on learning, little has been done to investigate the potentially separate contributions of RPE valence (positive or negative) and surprise (absolute degree of deviation from expectations). Here, we coupled single-trial electroencephalography (EEG) with simultaneously acquired fMRI, during a probabilistic reversal-learning task, to offer evidence of temporally overlapping but largely distinct spatial representations of RPE valence and surprise. Electrophysiological variability in RPE valence correlated with activity in regions of the human reward network promoting approach or avoidance learning. Electrophysiological variability in RPE surprise correlated primarily with activity in regions of the human attentional network controlling the speed of learning. Crucially, despite the largely separate spatial extent of these representations, our EEG-informed fMRI approach uniquely revealed a linear superposition of the two RPE components in a smaller network encompassing visuo-mnemonic and reward areas. Activity in this network was further predictive of stimulus value updating, indicating a comparable contribution of both signals to reward learning.
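The abstract's definitions can be made concrete with a minimal sketch: the RPE is the actual reward minus the expected reward, its valence is the sign of that difference, and its surprise is the absolute size of the difference. The function names, the fixed learning rate, and the simple delta-rule update below are illustrative assumptions, not the model fitted in the paper.

```python
def decompose_rpe(actual_reward, expected_reward):
    """Split a reward prediction error into valence and surprise.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rpe = actual_reward - expected_reward          # signed prediction error
    valence = 1 if rpe > 0 else -1 if rpe < 0 else 0  # positive or negative
    surprise = abs(rpe)                            # unsigned deviation
    return rpe, valence, surprise


def update_expectation(expected_reward, rpe, learning_rate=0.1):
    """Delta-rule update (an assumed learning rule with a fixed rate)."""
    return expected_reward + learning_rate * rpe


if __name__ == "__main__":
    expected = 0.5  # assumed initial reward expectation
    for outcome in [1.0, 0.0, 1.0, 1.0]:  # made-up outcomes, not study data
        rpe, valence, surprise = decompose_rpe(outcome, expected)
        expected = update_expectation(expected, rpe)
        print(f"outcome={outcome} valence={valence:+d} "
              f"surprise={surprise:.3f} new_expectation={expected:.3f}")
```

In this sketch a rewarded and an unrewarded trial can carry the same surprise while differing in valence, which is the distinction the study examines.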

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e099/5500565/99eb5f995196/41598_2017_4507_Fig1_HTML.jpg
