

Model-based and model-free pain avoidance learning.

Author information

Wang Oliver, Lee Sang Wan, O'Doherty John, Seymour Ben, Yoshida Wako

Affiliations

Department of Neural Computation for Decision-making, Advanced Telecommunications Research Institute International, Kyoto, Japan.

Department of Biology, Stanford University, Stanford, CA, USA.

Publication details

Brain Neurosci Adv. 2018 May 5;2:2398212818772964. doi: 10.1177/2398212818772964. eCollection 2018.

Abstract

While there is good evidence that reward learning is underpinned by two distinct decision control systems, a cognitive 'model-based' system and a habit-based 'model-free' system, a comparable distinction for punishment avoidance has been much less clear. We implemented a pain avoidance task that placed differential emphasis on putative model-based and model-free processing, mirroring a paradigm and modelling approach recently developed for reward-based decision-making. Subjects performed a two-step decision-making task with probabilistic pain outcomes of varying magnitude. The delivery of outcomes was sometimes contingent on a rule signalled at the beginning of each trial, emulating a form of outcome devaluation. The behavioural data showed that subjects tended to use a mixed strategy: favouring the simpler model-free learning strategy when outcomes did not depend on the rule, and favouring a model-based strategy when they did. Furthermore, the data were well described by a dynamic transition model between the two controllers. When compared with data from a reward-based task (albeit tested in the context of the scanner), we observed that avoidance involved a significantly greater tendency for subjects to switch between the model-free and model-based systems in the face of changes in uncertainty. Our study suggests a dual-system model of pain avoidance, similar to, but possibly more dynamically flexible than, reward-based decision-making.
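To make the dual-controller idea concrete, below is a minimal, illustrative sketch of a generic model-free/model-based mixture learner on a toy two-step avoidance task. This is not the authors' model: the task structure, all parameters, and in particular the fixed mixture weight `W_MB` are assumptions (the paper's model instead transitions dynamically between controllers as a function of uncertainty).

```python
import random

random.seed(0)

# Assumed parameters for this sketch only.
ALPHA = 0.3     # learning rate
W_MB = 0.6      # model-based weight (fixed here; dynamic in the paper)
P_COMMON = 0.7  # probability that step-1 action a leads to second-step state a

# Pain probability for each (second-step state, action); pain is coded as -1,
# so the agent maximises value to avoid it. Values are illustrative.
PAIN_PROB = [[0.8, 0.2], [0.2, 0.8]]

# Model-free values: step-1 actions, and (state, action) pairs at step 2.
q1_mf = [0.0, 0.0]
q2 = [[0.0, 0.0], [0.0, 0.0]]

# Transition counts learned from experience, used by the model-based controller.
trans_counts = [[1.0, 1.0], [1.0, 1.0]]  # trans_counts[a1][s2]

def step1_values():
    """Blend model-free and model-based values for the two step-1 actions."""
    vals = []
    for a in range(2):
        total = sum(trans_counts[a])
        # Model-based value: expected least-painful second-step value,
        # computed by planning through the learned transition model.
        mb = sum(trans_counts[a][s] / total * max(q2[s]) for s in range(2))
        vals.append(W_MB * mb + (1 - W_MB) * q1_mf[a])
    return vals

def run_trial():
    # Epsilon-greedy choice at both steps.
    vals = step1_values()
    a1 = max(range(2), key=lambda a: vals[a]) if random.random() > 0.1 else random.randrange(2)
    s2 = a1 if random.random() < P_COMMON else 1 - a1
    a2 = max(range(2), key=lambda a: q2[s2][a]) if random.random() > 0.1 else random.randrange(2)
    outcome = -1.0 if random.random() < PAIN_PROB[s2][a2] else 0.0

    # Model-free temporal-difference updates.
    q2[s2][a2] += ALPHA * (outcome - q2[s2][a2])
    q1_mf[a1] += ALPHA * (q2[s2][a2] - q1_mf[a1])
    # Update the transition model from the observed transition.
    trans_counts[a1][s2] += 1.0

for _ in range(2000):
    run_trial()

# Preferred (least-painful) action in each second-step state after learning.
print([max(range(2), key=lambda a: q2[s][a]) for s in range(2)])
```

The key design point the sketch illustrates is that both controllers learn from the same experience but value actions differently: the model-free controller caches values directly, while the model-based controller plans through a learned transition model, which is what lets it respond immediately to a rule change such as outcome devaluation.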


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f098/7058257/dd58c83000ac/10.1177_2398212818772964-fig1.jpg
