
A note on the analysis of two-stage task results: How changes in task structure affect what model-free and model-based strategies predict about the effects of reward and transition on the stay probability.

Affiliations

Laboratory for Social and Neural Systems Research, Department of Economics, University of Zurich, Zurich, Switzerland.

Zurich Center for Neuroscience, University of Zurich and ETH, Zurich, Switzerland.

Publication information

PLoS One. 2018 Apr 3;13(4):e0195328. doi: 10.1371/journal.pone.0195328. eCollection 2018.

DOI:10.1371/journal.pone.0195328
PMID:29614130
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5882146/
Abstract

Many studies that aim to detect model-free and model-based influences on behavior employ two-stage behavioral tasks of the type pioneered by Daw and colleagues in 2011. Such studies commonly modify existing two-stage decision paradigms in order to better address a given hypothesis, which is an important means of scientific progress. It is, however, critical to fully appreciate the impact of any modified or novel experimental design features on the expected results. Here, we use two concrete examples to demonstrate that relatively small changes in the two-stage task design can substantially change the pattern of actions taken by model-free and model-based agents as a function of the reward outcomes and transitions on previous trials. In the first, we show that, under specific conditions, purely model-free agents will produce the reward by transition interactions typically thought to characterize model-based behavior on a two-stage task. The second example shows that model-based agents' behavior is driven by a main effect of transition-type in addition to the canonical reward by transition interaction whenever the reward probabilities of the final states do not sum to one. Together, these examples emphasize the task-dependence of model-free and model-based behavior and highlight the benefits of using computer simulations to determine what pattern of results to expect from both model-free and model-based agents performing a given two-stage decision task in order to design choice paradigms and analysis strategies best suited to the current question.
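The kind of simulation the abstract recommends can be sketched in a few lines. The code below is an illustration only, not the authors' code or task: it assumes a stripped-down two-stage task in which each final state has a single reward probability (no second-stage choice), fixed rather than drifting reward probabilities, a delta-rule model-free learner, and a model-based learner that plans through a known 0.7/0.3 transition structure. The function name `simulate` and every parameter value are hypothetical choices made for this sketch.

```python
import math
import random


def simulate(agent, reward_probs=(0.8, 0.2), n_trials=20000,
             alpha=0.5, beta=5.0, common=0.7, seed=0):
    """Return stay probabilities keyed by (rewarded, transition).

    agent: "model_free" or "model_based" (illustrative agents only).
    reward_probs: assumed fixed reward probability of each final state.
    """
    rng = random.Random(seed)
    q_mf = [0.5, 0.5]  # model-free values of the two first-stage actions
    v = [0.5, 0.5]     # learned values of the two final states
    counts = {}        # (rewarded, transition) -> [n_stay, n_total]
    prev = None        # (action, rewarded, transition) of the previous trial

    for _ in range(n_trials):
        if agent == "model_based":
            # Plan through the known transition structure.
            q = [common * v[0] + (1 - common) * v[1],
                 (1 - common) * v[0] + common * v[1]]
        else:
            q = q_mf

        # Softmax (logistic) choice between the two first-stage actions.
        p_choose_1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        action = 1 if rng.random() < p_choose_1 else 0

        # Score this choice as stay/switch relative to the previous trial.
        if prev is not None:
            cell = counts.setdefault(prev[1:], [0, 0])
            cell[0] += int(action == prev[0])
            cell[1] += 1

        # Action a leads to final state a on common transitions.
        is_common = rng.random() < common
        state = action if is_common else 1 - action
        reward = 1 if rng.random() < reward_probs[state] else 0

        # Both value tables are updated; each agent reads only its own.
        v[state] += alpha * (reward - v[state])
        q_mf[action] += alpha * (reward - q_mf[action])
        prev = (action, reward, "common" if is_common else "rare")

    return {key: stay / total for key, (stay, total) in counts.items()}


if __name__ == "__main__":
    for agent in ("model_free", "model_based"):
        stay = simulate(agent)
        for key in sorted(stay):
            print(agent, key, round(stay[key], 3))
```

Under these assumptions the model-free agent should show a main effect of reward and the model-based agent a reward by transition interaction. Changing `reward_probs` so that the two values no longer sum to one, or altering the task structure, is a quick way to check what each strategy predicts for a modified design before collecting data.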


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b41/5882146/b9a9663856a7/pone.0195328.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b41/5882146/53b6cbd611e1/pone.0195328.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b41/5882146/8a7292ad7ce0/pone.0195328.g002.jpg


