A Comparison Model of Reinforcement-Learning and Win-Stay-Lose-Shift Decision-Making Processes: A Tribute to W.K. Estes.

Author Information

Darrell A. Worthy, W. Todd Maddox

Affiliations

Texas A&M University.

The University of Texas at Austin.

Publication Information

J Math Psychol. 2014 Apr 1;59:41-49. doi: 10.1016/j.jmp.2013.10.001.

Abstract

W.K. Estes often championed an approach to model development whereby an existing model was augmented by the addition of one or more free parameters, and a comparison between the simple model and the more complex, augmented model determined whether the additions were justified. Following this same approach, we utilized Estes' (1950) own augmented learning equations to improve the fit and plausibility of a win-stay-lose-shift (WSLS) model that we have used in much of our recent work. Estes also championed models that assume a comparison between multiple concurrent cognitive processes. In line with this, we develop a WSLS-Reinforcement Learning (RL) model in which the output of a WSLS process, which gives the probability of staying with or shifting away from the previously chosen option based on the last two decision outcomes, is compared with the output of an RL process, which gives the probability of selecting each option based on a comparison of the options' expected values. Fits to data from three different decision-making experiments suggest that the augmentations to the WSLS and RL models lead to a better account of decision-making behavior. Our results also support the assertion that human participants weigh the outputs of WSLS and RL processes during decision-making.
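The abstract describes a recipe that is straightforward to make concrete: an RL process maintains expected values and converts them to choice probabilities, a WSLS process converts recent outcomes into stay/shift probabilities, and a mixture weight arbitrates between the two outputs. Below is a minimal Python sketch of such a hybrid for a two-option task, not the paper's exact formulation: the delta-rule update, the softmax choice rule, the threshold definition of a "win", the conditioning on only the most recent outcome (the published model conditions on the last two outcomes and updates the WSLS probabilities with Estes' (1950) learning equations), and all parameter names (w, beta, alpha, p_stay_win, p_shift_lose) are illustrative assumptions.

```python
import numpy as np

def softmax(values, beta):
    # Convert expected values into choice probabilities;
    # beta is the inverse-temperature (exploitation) parameter.
    e = np.exp(beta * (values - np.max(values)))
    return e / e.sum()

def wsls_rl_choice_probs(values, last_choice, last_reward, reward_threshold,
                         p_stay_win, p_shift_lose, w, beta):
    # Blend a WSLS process and an RL process for a two-option task.
    # w is the mixture weight on the WSLS output (0 = pure RL, 1 = pure WSLS).
    values = np.asarray(values, dtype=float)

    # WSLS process: stay/shift probabilities driven by the last outcome.
    # (Simplified: the published model conditions on the last TWO outcomes.)
    wsls = np.empty(2)
    if last_reward >= reward_threshold:              # counted as a "win"
        wsls[last_choice] = p_stay_win               # stay with a winner
        wsls[1 - last_choice] = 1.0 - p_stay_win
    else:                                            # counted as a "loss"
        wsls[last_choice] = 1.0 - p_shift_lose       # shift away from a loser
        wsls[1 - last_choice] = p_shift_lose

    # RL process: softmax over current expected values.
    rl = softmax(values, beta)

    # Weighted comparison of the two processes' outputs.
    return w * wsls + (1.0 - w) * rl

def update_values(values, choice, reward, alpha):
    # Delta-rule update of the chosen option's expected value.
    values = np.asarray(values, dtype=float).copy()
    values[choice] += alpha * (reward - values[choice])
    return values

# Example: option 0 was chosen last trial and paid above threshold.
v = update_values([0.5, 0.5], choice=0, reward=1.0, alpha=0.1)
p = wsls_rl_choice_probs(v, last_choice=0, last_reward=1.0, reward_threshold=0.5,
                         p_stay_win=0.9, p_shift_lose=0.8, w=0.5, beta=2.0)
print(p)  # probabilities over the two options, summing to 1
```

In this sketch the single mixture weight w stands in for the comparison between the two processes; estimating w per participant would index how heavily each process is weighed, which is the kind of evidence the abstract reports.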

