
The interpretation of computational model parameters depends on the context.

Affiliations

Department of Psychology, University of California, Berkeley, Berkeley, United States.

Department of Psychology, New York University, New York, United States.

Publication information

Elife. 2022 Nov 4;11:e75474. doi: 10.7554/eLife.75474.

DOI: 10.7554/eLife.75474
PMID: 36331872
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9635876/
Abstract

Reinforcement Learning (RL) models have revolutionized the cognitive and brain sciences, promising to explain behavior from simple conditioning to complex problem solving, to shed light on developmental and individual differences, and to anchor cognitive processes in specific brain mechanisms. However, the RL literature increasingly reveals contradictory results, which might cast doubt on these claims. We hypothesized that many contradictions arise from two commonly-held assumptions about computational model parameters that are actually often invalid: That parameters generalize between contexts (e.g. tasks, models) and that they capture interpretable (i.e. unique, distinctive) neurocognitive processes. To test this, we asked 291 participants aged 8-30 years to complete three learning tasks in one experimental session, and fitted RL models to each. We found that some parameters (exploration / decision noise) showed significant generalization: they followed similar developmental trajectories, and were reciprocally predictive between tasks. Still, generalization was significantly below the methodological ceiling. Furthermore, other parameters (learning rates, forgetting) did not show evidence of generalization, and sometimes even opposite developmental trajectories. Interpretability was low for all parameters. We conclude that the systematic study of context factors (e.g. reward stochasticity; task volatility) will be necessary to enhance the generalizability and interpretability of computational cognitive models.
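To make concrete what the abstract means by fitted RL model parameters (learning rates and exploration / decision noise), here is a minimal, hypothetical sketch of a delta-rule learner with softmax choice on a two-armed bandit. This is the generic textbook form, not the paper's actual models or tasks; all names (`run_bandit`, `alpha`, `beta`) are illustrative.

```python
import math
import random

def softmax_choice(q_values, beta, rng):
    """Sample an action with softmax probabilities.
    beta is the inverse temperature: low beta means noisier,
    more exploratory choices (the 'decision noise' parameter)."""
    weights = [math.exp(beta * q) for q in q_values]
    r = rng.random() * sum(weights)
    cum = 0.0
    for action, w in enumerate(weights):
        cum += w
        if r <= cum:
            return action
    return len(weights) - 1

def run_bandit(reward_probs, alpha, beta, n_trials=200, seed=0):
    """Delta-rule learner on a stochastic bandit.
    alpha: learning rate; beta: softmax inverse temperature."""
    rng = random.Random(seed)
    q = [0.5] * len(reward_probs)  # initial value estimates
    choices = []
    for _ in range(n_trials):
        a = softmax_choice(q, beta, rng)
        reward = 1.0 if rng.random() < reward_probs[a] else 0.0
        q[a] += alpha * (reward - q[a])  # delta rule: Q <- Q + alpha * (r - Q)
        choices.append(a)
    return q, choices
```

Fitting such a model to a participant's choices yields per-person estimates of `alpha` and `beta`; the paper's question is whether those estimates mean the same thing when the task, the reward statistics, or the model variant changes.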


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/17fa696daede/elife-75474-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/b43c3760dc56/elife-75474-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/77f5633b7bdc/elife-75474-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/5617e1e35247/elife-75474-fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/bf5a53e9f1b9/elife-75474-fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/ca10674fadb1/elife-75474-app3-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/96a3e989a69d/elife-75474-app4-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/6397553ec66e/elife-75474-app5-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/e554e98c4c00/elife-75474-app8-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/c0fb89b1acd4/elife-75474-app8-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/d17a1da34477/elife-75474-app8-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/112fafd22635/elife-75474-app8-fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/6c0f59bebe0c/elife-75474-sa2-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/d8a8b97a5a8f/elife-75474-sa2-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/95f394e65726/elife-75474-sa2-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3409/9635876/d31c5c735f7b/elife-75474-sa2-fig4.jpg

Similar articles

1. The interpretation of computational model parameters depends on the context.
   Elife. 2022 Nov 4;11:e75474. doi: 10.7554/eLife.75474.
2. What do Reinforcement Learning Models Measure? Interpreting Model Parameters in Cognition and Neuroscience.
   Curr Opin Behav Sci. 2021 Oct;41:128-137. doi: 10.1016/j.cobeha.2021.06.004. Epub 2021 Jul 3.
3. Multiple memory systems as substrates for multiple decision systems.
   Neurobiol Learn Mem. 2015 Jan;117:4-13. doi: 10.1016/j.nlm.2014.04.014. Epub 2014 May 15.
4. Generalization of value in reinforcement learning by humans.
   Eur J Neurosci. 2012 Apr;35(7):1092-104. doi: 10.1111/j.1460-9568.2012.08017.x.
5. Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T.
   Hum Brain Mapp. 2022 Oct 15;43(15):4750-4790. doi: 10.1002/hbm.25988. Epub 2022 Jul 21.
6. Learning and forgetting using reinforced Bayesian change detection.
   PLoS Comput Biol. 2019 Apr 17;15(4):e1006713. doi: 10.1371/journal.pcbi.1006713. eCollection 2019 Apr.
7. Differential effects of reward and punishment on reinforcement-based motor learning and generalization.
   J Neurophysiol. 2023 Nov 1;130(5):1150-1161. doi: 10.1152/jn.00242.2023. Epub 2023 Oct 4.
8. Reward Learning as a Potential Mechanism for Improvement in Schizophrenia Spectrum Disorders Following Cognitive Remediation: Protocol for a Clinical, Nonrandomized, Pre-Post Pilot Study.
   JMIR Res Protoc. 2024 Jan 22;13:e52505. doi: 10.2196/52505.
9. A probabilistic successor representation for context-dependent learning.
   Psychol Rev. 2024 Mar;131(2):578-597. doi: 10.1037/rev0000414. Epub 2023 May 11.
10. Integrating unsupervised and reinforcement learning in human categorical perception: A computational model.
   PLoS One. 2022 May 10;17(5):e0267838. doi: 10.1371/journal.pone.0267838. eCollection 2022.

Cited by

1. The joint estimation of uncertainty and its relationship with psychotic-like traits and psychometric schizotypy.
   Npj Ment Health Res. 2025 Aug 31;4(1):40. doi: 10.1038/s44184-025-00146-6.
2. Investigating the potential psychological significance of the alpha parameter in the Lévy flight model of decision making: A reliability analysis approach.
   Behav Res Methods. 2025 Aug 26;57(10):269. doi: 10.3758/s13428-025-02784-2.
3. Differential Associations of Dopamine and Serotonin With Reward and Punishment Processes in Humans: A Systematic Review and Meta-Analysis.
   JAMA Psychiatry. 2025 Jun 11. doi: 10.1001/jamapsychiatry.2025.0839.
4. Reinforcement learning increasingly relates to memory specificity from childhood to adulthood.
   Nat Commun. 2025 Apr 30;16(1):4074. doi: 10.1038/s41467-025-59379-w.
5. Computational Perspectives on Cognition in Anorexia Nervosa: A Systematic Review.
   Comput Psychiatr. 2025 Apr 7;9(1):100-121. doi: 10.5334/cpsy.128. eCollection 2025.
6. Eating disorder symptoms and emotional arousal modulate food biases during reward learning in females.
   Nat Commun. 2025 Mar 26;16(1):2938. doi: 10.1038/s41467-025-57872-w.
7. Genetic changes linked to two different syndromic forms of autism enhance reinforcement learning in adolescent male but not female mice.
   bioRxiv. 2025 Jan 15:2025.01.15.633099. doi: 10.1101/2025.01.15.633099.
8. Interpretation of individual differences in computational neuroscience using a latent input approach.
   Dev Cogn Neurosci. 2025 Apr;72:101512. doi: 10.1016/j.dcn.2025.101512. Epub 2025 Jan 16.
9. Rewards transiently and automatically enhance sustained attention.
   J Exp Psychol Gen. 2025 Apr;154(4):1063-1079. doi: 10.1037/xge0001727. Epub 2025 Jan 20.
10. A multiverse assessment of the reliability of the self-matching task as a measurement of the self-prioritization effect.
   Behav Res Methods. 2025 Jan 2;57(1):37. doi: 10.3758/s13428-024-02538-6.

References

1. A role for adaptive developmental plasticity in learning and decision making.
   Curr Opin Behav Sci. 2020 Dec;36:48-54. doi: 10.1016/j.cobeha.2020.07.010. Epub 2020 Aug 23.
2. Transient food insecurity during the juvenile-adolescent period affects adult weight, cognitive flexibility, and dopamine neurobiology.
   Curr Biol. 2022 Sep 12;32(17):3690-3703.e5. doi: 10.1016/j.cub.2022.06.089. Epub 2022 Jul 20.
3. Reinforcement learning and Bayesian inference provide complementary models for the unique advantage of adolescents in stochastic reversal.
   Dev Cogn Neurosci. 2022 Jun;55:101106. doi: 10.1016/j.dcn.2022.101106. Epub 2022 Apr 22.
4. Sufficient reliability of the behavioral and computational readouts of a probabilistic reversal learning task.
   Behav Res Methods. 2022 Dec;54(6):2993-3014. doi: 10.3758/s13428-021-01739-7. Epub 2022 Feb 15.
5. Valence biases in reinforcement learning shift across adolescence and modulate subsequent memory.
   Elife. 2022 Jan 24;11:e64620. doi: 10.7554/eLife.64620.
6. What do Reinforcement Learning Models Measure? Interpreting Model Parameters in Cognition and Neuroscience.
   Curr Opin Behav Sci. 2021 Oct;41:128-137. doi: 10.1016/j.cobeha.2021.06.004. Epub 2021 Jul 3.
7. Modeling changes in probabilistic reinforcement learning during adolescence.
   PLoS Comput Biol. 2021 Jul 1;17(7):e1008524. doi: 10.1371/journal.pcbi.1008524. eCollection 2021 Jul.
8. Decision-making ability, psychopathology, and brain connectivity.
   Neuron. 2021 Jun 16;109(12):2025-2040.e7. doi: 10.1016/j.neuron.2021.04.019. Epub 2021 May 20.
9. Reliability and Replicability of Implicit and Explicit Reinforcement Learning Paradigms in People With Psychotic Disorders.
   Schizophr Bull. 2021 Apr 29;47(3):731-739. doi: 10.1093/schbul/sbaa165.
10. Dissociation between asymmetric value updating and perseverance in human reinforcement learning.
   Sci Rep. 2021 Feb 11;11(1):3574. doi: 10.1038/s41598-020-80593-7.