Reinforcement learning in and out of context: The effects of attentional focus.

Affiliation

Department of Psychology, University of South Carolina.

Publication information

J Exp Psychol Learn Mem Cogn. 2023 Aug;49(8):1193-1217. doi: 10.1037/xlm0001145. Epub 2022 Jul 4.

DOI:10.1037/xlm0001145
PMID:35787139
Abstract

In reinforcement learning (RL) tasks, decision makers learn the values of actions in a context-dependent fashion. Although context dependence has many advantages, it can lead to suboptimal preferences when choice options are extrapolated beyond their original encoding contexts. Here, we tested whether we could manipulate context dependence in RL by introducing a secondary task designed to bias attention toward either absolute or relative outcomes. Participants completed a learning phase that involved choices between two (Experiment 1; n = 111) or three (Experiment 2; n = 90) options per trial with complete feedback. Choice options were grouped in stable contexts so that only a small set of the possible combinations were encountered. One group of participants rated how they felt about particular options (Feelings condition), and another group reported how much they expected to win from particular options (Outcomes condition) at occasional points throughout the learning phase. A third group (Control condition) made no ratings. In the subsequent transfer test, participants chose between all possible pairs of options without feedback. The experimental manipulation had no effect on learning phase performance but a significant effect on transfer, with the Feelings and Control conditions exhibiting greater context dependence than the Outcomes condition. Further, rated feelings reflected relative valuation whereas expected outcomes were more sensitive to absolute option values. Hierarchical Bayesian modeling was used to summarize the findings from both experiments. Our results suggest that attending to affective reactions versus expected outcomes moderates the effects of encoding context on subsequent choices. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
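
The contrast between absolute and relative valuation described in the abstract can be illustrated with a toy simulation. The sketch below is not the hierarchical Bayesian model used in the paper. It assumes a simple delta-rule learner, deterministic payoffs, and mean-centering within each context as the form of relative encoding. The option labels, payoff values, learning rate, and the `learn` function are all hypothetical choices made for illustration.

```python
# Illustrative toy model only; not the authors' hierarchical Bayesian model.
ALPHA = 0.1  # learning rate (assumed value)

# Hypothetical deterministic payoffs, grouped into two stable contexts.
# B is the worse option in a rich context; C is the better option in a poor one.
CONTEXTS = {
    "high": {"A": 0.9, "B": 0.6},
    "low": {"C": 0.4, "D": 0.1},
}


def learn(relative: bool, n_trials: int = 200) -> dict[str, float]:
    """Delta-rule value learning with complete feedback.

    If `relative` is True, each outcome is re-expressed relative to the mean
    outcome of its context before the update (an assumed form of relative
    encoding). Otherwise the raw outcome is learned.
    """
    q = {option: 0.0 for payoffs in CONTEXTS.values() for option in payoffs}
    for _ in range(n_trials):
        for payoffs in CONTEXTS.values():
            reference = sum(payoffs.values()) / len(payoffs) if relative else 0.0
            # Complete feedback: every option in the context is updated.
            for option, reward in payoffs.items():
                q[option] += ALPHA * ((reward - reference) - q[option])
    return q


absolute_q = learn(relative=False)
relative_q = learn(relative=True)

# Transfer test: B and C were never paired during learning.
print("absolute learner prefers B over C:", absolute_q["B"] > absolute_q["C"])  # True
print("relative learner prefers B over C:", relative_q["B"] > relative_q["C"])  # False
```

Under these assumptions the relative learner ends up preferring C to B in the transfer pair even though B pays more. This is the kind of suboptimal preference that can arise when options are compared outside their original encoding context.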


Similar articles

1
Reinforcement learning in and out of context: The effects of attentional focus.
J Exp Psychol Learn Mem Cogn. 2023 Aug;49(8):1193-1217. doi: 10.1037/xlm0001145. Epub 2022 Jul 4.
2
Autonomic responses to choice outcomes: Links to task performance and reinforcement-learning parameters.
Biol Psychol. 2020 Oct;156:107968. doi: 10.1016/j.biopsycho.2020.107968. Epub 2020 Oct 4.
3
Decomposing the effects of context valence and feedback information on speed and accuracy during reinforcement learning: a meta-analytical approach using diffusion decision modeling.
Cogn Affect Behav Neurosci. 2019 Jun;19(3):490-502. doi: 10.3758/s13415-019-00723-1.
4
Linking confidence biases to reinforcement-learning processes.
Psychol Rev. 2023 Jul;130(4):1017-1043. doi: 10.1037/rev0000424. Epub 2023 May 8.
5
Intact Reinforcement Learning But Impaired Attentional Control During Multidimensional Probabilistic Learning in Older Adults.
J Neurosci. 2020 Jan 29;40(5):1084-1096. doi: 10.1523/JNEUROSCI.0254-19.2019. Epub 2019 Dec 11.
6
Computational evidence for hierarchically structured reinforcement learning in humans.
Proc Natl Acad Sci U S A. 2020 Nov 24;117(47):29381-29389. doi: 10.1073/pnas.1912330117.
7
Temporal and state abstractions for efficient learning, transfer, and composition in humans.
Psychol Rev. 2021 Jul;128(4):643-666. doi: 10.1037/rev0000295. Epub 2021 May 20.
8
The Effect of Counterfactual Information on Outcome Value Coding in Medial Prefrontal and Cingulate Cortex: From an Absolute to a Relative Neural Code.
J Neurosci. 2020 Apr 15;40(16):3268-3277. doi: 10.1523/JNEUROSCI.1712-19.2020. Epub 2020 Mar 10.
9
Frequency effects in action versus value learning.
J Exp Psychol Learn Mem Cogn. 2022 Sep;48(9):1311-1327. doi: 10.1037/xlm0000896. Epub 2021 Apr 19.
10
Generalization of value in reinforcement learning by humans.
Eur J Neurosci. 2012 Apr;35(7):1092-104. doi: 10.1111/j.1460-9568.2012.08017.x.

Cited by

1
Relative Value Encoding in Large Language Models: A Multi-Task, Multi-Model Investigation.
Open Mind (Camb). 2025 May 9;9:709-725. doi: 10.1162/opmi_a_00209. eCollection 2025.
2
Comparing experience- and description-based economic preferences across 11 countries.
Nat Hum Behav. 2024 Aug;8(8):1554-1567. doi: 10.1038/s41562-024-01894-9. Epub 2024 Jun 14.
3
Intrinsic rewards explain context-sensitive valuation in reinforcement learning.
PLoS Biol. 2023 Jul 17;21(7):e3002201. doi: 10.1371/journal.pbio.3002201. eCollection 2023 Jul.
4
The functional form of value normalization in human reinforcement learning.
Elife. 2023 Jul 10;12:e83891. doi: 10.7554/eLife.83891.
5
Outcome context-dependence is not WEIRD: Comparing reinforcement- and description-based economic preferences worldwide.
Res Sq. 2023 Mar 2:rs.3.rs-2621222. doi: 10.21203/rs.3.rs-2621222/v1.
6
Training diversity promotes absolute-value-guided choice.
PLoS Comput Biol. 2022 Nov 2;18(11):e1010664. doi: 10.1371/journal.pcbi.1010664. eCollection 2022 Nov.