

Hidden Reward: Affect and Its Prediction Errors as Windows Into Subjective Value.

Authors

Vollberg Marius C, Sander David

Affiliations

Department of Psychology, University of Amsterdam.

Swiss Center for Affective Sciences, University of Geneva.

Publication

Curr Dir Psychol Sci. 2024 Apr;33(2):93-99. doi: 10.1177/09637214231217678. Epub 2024 Jan 19.

DOI: 10.1177/09637214231217678
PMID: 38562909
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10981566/
Abstract

Scientists increasingly apply concepts from reinforcement learning to affect, but which concepts should apply? And what can their application reveal that we cannot know from directly observable states? An important reinforcement learning concept is the difference between reward expectations and outcomes. Such reward prediction errors have become foundational to research on adaptive behavior in humans, animals, and machines. Owing to historical focus on animal models and observable reward (e.g., food or money), however, relatively little attention has been paid to the fact that humans can additionally report correspondingly expected and experienced affect (e.g., feelings). Reflecting a broader "rise of affectivism," attention has started to shift, revealing explanatory power of expected and experienced feelings-including prediction errors-above and beyond observable reward. We propose that applying concepts from reinforcement learning to affect holds promise for elucidating subjective value. Simultaneously, we urge scientists to test-rather than inherit-concepts that may not apply directly.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e54f/10981566/bcef9d56eee9/10.1177_09637214231217678-fig1.jpg

Similar Articles

1. Hidden Reward: Affect and Its Prediction Errors as Windows Into Subjective Value.
   Curr Dir Psychol Sci. 2024 Apr;33(2):93-99. doi: 10.1177/09637214231217678. Epub 2024 Jan 19.
2. Momentary subjective well-being depends on learning and not reward.
   Elife. 2020 Nov 17;9:e57977. doi: 10.7554/eLife.57977.
3. Nutrient-Sensitive Reinforcement Learning in Monkeys.
   J Neurosci. 2023 Mar 8;43(10):1714-1730. doi: 10.1523/JNEUROSCI.0752-22.2022. Epub 2023 Jan 20.
4. The effect of reward prediction errors on subjective affect depends on outcome valence and decision context.
   Emotion. 2024 Apr;24(3):894-911. doi: 10.1037/emo0001310. Epub 2023 Nov 13.
5. How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.
   J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.
6. Reward prediction errors, not sensory prediction errors, play a major role in model selection in human reinforcement learning.
   Neural Netw. 2022 Oct;154:109-121. doi: 10.1016/j.neunet.2022.07.002. Epub 2022 Jul 13.
7. Perceptual Salience and Reward Both Influence Feedback-Related Neural Activity Arising from Choice.
   J Neurosci. 2015 Sep 23;35(38):13064-75. doi: 10.1523/JNEUROSCI.1601-15.2015.
8. Credit Assignment in a Motor Decision Making Task Is Influenced by Agency and Not Sensory Prediction Errors.
   J Neurosci. 2018 May 9;38(19):4521-4530. doi: 10.1523/JNEUROSCI.3601-17.2018. Epub 2018 Apr 12.
9. Depressive symptoms bias the prediction-error enhancement of memory towards negative events in reinforcement learning.
   Psychopharmacology (Berl). 2019 Aug;236(8):2425-2435. doi: 10.1007/s00213-019-05322-z. Epub 2019 Jul 26.
10. The value of confidence: Confidence prediction errors drive value-based learning in the absence of external feedback.
    PLoS Comput Biol. 2022 Oct 3;18(10):e1010580. doi: 10.1371/journal.pcbi.1010580. eCollection 2022 Oct.

Cited By

1. Self-utility distance as a computational approach to understanding self-concept clarity.
   Commun Psychol. 2025 Mar 25;3(1):50. doi: 10.1038/s44271-025-00231-8.

References

1. Affective prediction errors in persistence and escalation of aggression.
   J Exp Psychol Gen. 2024 Jun;153(6):1551-1567. doi: 10.1037/xge0001570. Epub 2024 May 2.
2. Bayesianism and wishful thinking are compatible.
   Nat Hum Behav. 2024 Apr;8(4):692-701. doi: 10.1038/s41562-024-01819-6. Epub 2024 Feb 23.
3. A probabilistic map of emotional experiences during competitive social interactions.
   Nat Commun. 2022 Mar 31;13(1):1718. doi: 10.1038/s41467-022-29372-8.
4. Differential Contributions of Ventral Striatum Subregions to the Motivational and Hedonic Components of the Affective Processing of Reward.
   J Neurosci. 2022 Mar 30;42(13):2716-2728. doi: 10.1523/JNEUROSCI.1124-21.2022. Epub 2022 Feb 11.
5. "Liking" as an early and editable draft of long-run affective value.
   PLoS Biol. 2022 Jan 5;20(1):e3001476. doi: 10.1371/journal.pbio.3001476. eCollection 2022 Jan.
6. Emotion prediction errors guide socially adaptive behaviour.
   Nat Hum Behav. 2021 Oct;5(10):1391-1401. doi: 10.1038/s41562-021-01213-6. Epub 2021 Oct 19.
7. A Neurocomputational Model for Intrinsic Reward.
   J Neurosci. 2021 Oct 27;41(43):8963-8971. doi: 10.1523/JNEUROSCI.0858-20.2021. Epub 2021 Sep 20.
8. The rise of affectivism.
   Nat Hum Behav. 2021 Jul;5(7):816-820. doi: 10.1038/s41562-021-01130-8.
9. A computational reward learning account of social media engagement.
   Nat Commun. 2021 Feb 26;12(1):1311. doi: 10.1038/s41467-020-19607-x.
10. Expected Valence Predicts Choice in a Recurrent Decision Task.
    Front Neurosci. 2020 Nov 26;14:580970. doi: 10.3389/fnins.2020.580970. eCollection 2020.