Trial-by-trial learning of successor representations in human behavior.

Authors

Kahn Ari E, Bassett Dani S, Daw Nathaniel D

Affiliations

Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.

Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

bioRxiv. 2025 Jun 16:2024.11.07.622528. doi: 10.1101/2024.11.07.622528.

DOI:10.1101/2024.11.07.622528
PMID:40667003
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12262301/
Abstract

Decisions in humans and other organisms depend, in part, on learning and using models that capture the statistical structure of the world, including the long-run expected outcomes of our actions. One prominent approach to forecasting such long-run outcomes is the successor representation (SR), which predicts future states aggregated over multiple timesteps. Although much behavioral and neural evidence suggests that people and animals use such a representation, it remains unknown how they acquire it. It has frequently been assumed to be learned by temporal difference bootstrapping (SR-TD(0)), but this assumption has largely not been empirically tested or compared to alternatives including eligibility traces (SR-TD(λ)). Here we address this gap by leveraging trial-by-trial reaction times in graph sequence learning tasks, which are favorable for studying learning dynamics because the long horizons in these studies differentiate the transient update dynamics of different learning rules. We examined the behavior of SR-TD(λ) on a probabilistic graph learning task alongside a number of alternatives, and found that behavior was best explained by a hybrid model which learned via SR-TD(λ) alongside an additional predictive model of recency. The relatively large λ we estimate indicates a predominant role of eligibility trace mechanisms over the bootstrap-based chaining typically assumed. Our results provide insight into how humans learn predictive representations, and demonstrate that people simultaneously learn the SR alongside lower-order predictions.
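
The contrast the abstract draws between bootstrap-based chaining (SR-TD(0)) and eligibility traces (SR-TD(λ)) can be illustrated with a minimal sketch. This is a generic textbook-style update with accumulating traces, not the authors' model or fitting code. The function name `sr_td_lambda`, the parameter names, the default values, and the convention that the SR counts the current state are all assumptions made for illustration.

```python
import numpy as np


def sr_td_lambda(states, n_states, alpha=0.1, gamma=0.9, lam=0.0):
    """Learn a successor matrix M from one state sequence.

    Hypothetical illustration: M[i, j] estimates the discounted number of
    visits to state j starting from state i (current state included).
    lam=0 gives SR-TD(0); lam>0 adds an accumulating eligibility trace.
    """
    M = np.zeros((n_states, n_states))
    identity = np.eye(n_states)
    trace = np.zeros(n_states)
    for s, s_next in zip(states[:-1], states[1:]):
        # Decay the trace over previously visited states, then mark s.
        trace *= gamma * lam
        trace[s] += 1.0
        # Vector-valued TD error: one-hot of s plus the discounted
        # successor row of the next state, minus the current row.
        delta = identity[s] + gamma * M[s_next] - M[s]
        # Every state with a nonzero trace shares in the update.
        M += alpha * np.outer(trace, delta)
    return M


if __name__ == "__main__":
    sequence = [0, 1, 2]  # a single pass through three states
    M_td0 = sr_td_lambda(sequence, 3, lam=0.0)
    M_trace = sr_td_lambda(sequence, 3, lam=1.0)
    print("lam=0:", M_td0[0, 1])
    print("lam=1:", M_trace[0, 1])
```

After this single pass, the entry linking state 0 to state 1 is still zero under λ = 0, because the row for state 1 was empty when state 0 was updated and can only be chained in on later visits. Under λ = 1 it is already about 0.09, because the trace credits state 0 directly when state 1 is visited. This kind of difference in transient dynamics is what trial-by-trial data can, in principle, distinguish.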

Figures (from the PMC full text):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/112d/12262301/e3ea92faefad/nihpp-2024.11.07.622528v3-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/112d/12262301/26335cad6a29/nihpp-2024.11.07.622528v3-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/112d/12262301/db6027fa8bc7/nihpp-2024.11.07.622528v3-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/112d/12262301/88d18639bbfd/nihpp-2024.11.07.622528v3-f0004.jpg
