Computational modeling of epiphany learning.

Authors

Chen Wei James, Krajbich Ian

Affiliations

Department of Economics, The Ohio State University, Columbus, OH 43210.

Department of Economics, The Ohio State University, Columbus, OH 43210;

Publication information

Proc Natl Acad Sci U S A. 2017 May 2;114(18):4637-4642. doi: 10.1073/pnas.1618161114. Epub 2017 Apr 17.

DOI: 10.1073/pnas.1618161114
PMID: 28416682
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5422778/
Abstract

Models of reinforcement learning (RL) are prevalent in the decision-making literature, but not all behavior seems to conform to the gradual convergence that is a central feature of RL. In some cases learning seems to happen all at once. Limited prior research on these "epiphanies" has shown evidence of sudden changes in behavior, but it remains unclear how such epiphanies occur. We propose a sequential-sampling model of epiphany learning (EL) and test it using an eye-tracking experiment. In the experiment, subjects repeatedly play a strategic game that has an optimal strategy. Subjects can learn over time from feedback but are also allowed to commit to a strategy at any time, eliminating all other options and opportunities to learn. We find that the EL model is consistent with the choices, eye movements, and pupillary responses of subjects who commit to the optimal strategy (correct epiphany) but not always of those who commit to a suboptimal strategy or who do not commit at all. Our findings suggest that EL is driven by a latent evidence accumulation process that can be revealed with eye-tracking data.
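The idea of a latent evidence accumulation process that ends in a commitment can be illustrated with a minimal sketch. This is not the authors' EL model: the function name `simulate_commit_trial` and the parameters `drift`, `noise_sd`, `threshold`, and `max_trials` are hypothetical, chosen only to show a generic accumulate-to-threshold process in which behavior changes all at once when a bound is reached.

```python
import random


def simulate_commit_trial(drift, noise_sd, threshold, max_trials, seed=0):
    """Return the first trial at which accumulated evidence reaches the
    threshold (the simulated "commit" point), or None if it never does.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same path
    evidence = 0.0
    for trial in range(1, max_trials + 1):
        # Each round of feedback adds a noisy increment of evidence.
        evidence += drift + rng.gauss(0.0, noise_sd)
        if evidence >= threshold:
            # Crossing the bound produces a sudden, one-time commitment.
            return trial
    return None


if __name__ == "__main__":
    commit_trial = simulate_commit_trial(
        drift=0.2, noise_sd=0.5, threshold=3.0, max_trials=30
    )
    print("committed on trial:", commit_trial)
```

In this sketch the accumulated evidence is hidden from an observer who sees only choices, which is why the abstract's point about eye-tracking data matters: such measures are proposed as a window onto the latent process before the commitment occurs.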

