Imaginative Reinforcement Learning: Computational Principles and Neural Mechanisms.

Affiliation

Harvard University.

Publication information

J Cogn Neurosci. 2017 Dec;29(12):2103-2113. doi: 10.1162/jocn_a_01170. Epub 2017 Jul 14.

DOI: 10.1162/jocn_a_01170
PMID: 28707569
Abstract

Imagination enables us not only to transcend reality but also to learn about it. In the context of reinforcement learning, an agent can rationally update its value estimates by simulating an internal model of the environment, provided that the model is accurate. In a series of sequential decision-making experiments, we investigated the impact of imaginative simulation on subsequent decisions. We found that imagination can cause people to pursue imagined paths, even when these paths are suboptimal. This bias is systematically related to participants' optimism about how much reward they expect to receive along imagined paths; providing feedback strongly attenuates the effect. The imagination effect can be captured by a reinforcement learning model that includes a bonus added onto imagined rewards. Using fMRI, we show that a network of regions associated with valuation is predictive of the imagination effect. These results suggest that imagination, although a powerful tool for learning, is also susceptible to motivational biases.
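
The abstract describes a reinforcement learning model that adds a bonus onto imagined rewards. The following is a minimal sketch of that idea under stated assumptions, not the authors' actual model. It uses a simple delta-rule value update, and the names `ImaginativeLearner`, `alpha`, `bonus`, and `simulate_choice`, as well as the reward values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ImaginativeLearner:
    """Toy value learner; all parameters here are illustrative assumptions."""
    alpha: float = 0.5   # learning rate (assumed)
    bonus: float = 0.0   # optimism bonus added to imagined rewards (assumed)
    values: Dict[str, float] = field(default_factory=dict)

    def update(self, path: str, reward: float, imagined: bool = False) -> float:
        """Delta-rule update; imagined outcomes get `bonus` added to the reward."""
        target = reward + (self.bonus if imagined else 0.0)
        old = self.values.get(path, 0.0)
        self.values[path] = old + self.alpha * (target - old)
        return self.values[path]

    def choose(self, paths: List[str]) -> str:
        """Greedy choice over the current value estimates."""
        return max(paths, key=lambda p: self.values.get(p, 0.0))


def simulate_choice(bonus: float) -> str:
    """One experienced path (reward 1.0) versus one imagined path (reward 0.8)."""
    learner = ImaginativeLearner(alpha=0.5, bonus=bonus)
    learner.update("experienced_path", reward=1.0)
    learner.update("imagined_path", reward=0.8, imagined=True)
    return learner.choose(["experienced_path", "imagined_path"])


if __name__ == "__main__":
    print(simulate_choice(bonus=0.0))
    print(simulate_choice(bonus=0.5))
```

In this toy setting, a zero bonus leaves the learner preferring the path with the higher true reward, whereas a sufficiently large bonus tips the greedy choice toward the imagined path even though it is suboptimal, which mirrors the qualitative bias the abstract reports.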

Similar articles

1
Imaginative Reinforcement Learning: Computational Principles and Neural Mechanisms.
J Cogn Neurosci. 2017 Dec;29(12):2103-2113. doi: 10.1162/jocn_a_01170. Epub 2017 Jul 14.
2
Enhanced Neural Responses to Imagined Primary Rewards Predict Reduced Monetary Temporal Discounting.
J Neurosci. 2015 Sep 23;35(38):13103-9. doi: 10.1523/JNEUROSCI.1863-15.2015.
3
How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.
J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.
4
Dynamic Interaction between Reinforcement Learning and Attention in Multidimensional Environments.
Neuron. 2017 Jan 18;93(2):451-463. doi: 10.1016/j.neuron.2016.12.040.
5
Neural basis of decision making guided by emotional outcomes.
J Neurophysiol. 2015 May 1;113(9):3056-68. doi: 10.1152/jn.00564.2014. Epub 2015 Feb 18.
6
Behavioral and neural predictors of upcoming decisions.
Cogn Affect Behav Neurosci. 2005 Jun;5(2):117-26. doi: 10.3758/cabn.5.2.117.
7
Neural evidence for adaptive strategy selection in value-based decision-making.
Cereb Cortex. 2014 Aug;24(8):2009-21. doi: 10.1093/cercor/bht049. Epub 2013 Mar 8.
8
CREB1 Genotype Modulates Adaptive Reward-Based Decisions in Humans.
Cereb Cortex. 2016 Jul;26(7):2970-81. doi: 10.1093/cercor/bhv104. Epub 2015 Jun 4.
9
Reward and Novelty Enhance Imagination of Future Events in a Motivational-Episodic Network.
PLoS One. 2015 Nov 23;10(11):e0143477. doi: 10.1371/journal.pone.0143477. eCollection 2015.
10
Neural systems for choice and valuation with counterfactual learning signals.
Neuroimage. 2014 Apr 1;89:57-69. doi: 10.1016/j.neuroimage.2013.11.051. Epub 2013 Dec 7.

Cited by

1
Using computational models of learning to advance cognitive behavioral therapy.
Commun Psychol. 2025 Apr 27;3(1):72. doi: 10.1038/s44271-025-00251-4.
2
A recurrent network model of planning explains hippocampal replay and human behavior.
Nat Neurosci. 2024 Jul;27(7):1340-1348. doi: 10.1038/s41593-024-01675-7. Epub 2024 Jun 7.
3
The Generative Adversarial Brain.
Front Artif Intell. 2019 Sep 18;2:18. doi: 10.3389/frai.2019.00018. eCollection 2019.
4
Rapid trial-and-error learning with simulation supports flexible tool use and physical reasoning.
Proc Natl Acad Sci U S A. 2020 Nov 24;117(47):29302-29310. doi: 10.1073/pnas.1912341117.
5
Model-free and model-based learning processes in the updating of explicit and implicit evaluations.
Proc Natl Acad Sci U S A. 2019 Mar 26;116(13):6035-6044. doi: 10.1073/pnas.1820238116. Epub 2019 Mar 12.