Evaluating the TD model of classical conditioning.

Authors

Ludvig Elliot A, Sutton Richard S, Kehoe E James

Affiliation

Princeton Neuroscience Institute and Department of Mechanical & Aerospace Engineering, Princeton University, 3-N-12 Green Hall, Princeton, NJ 08542, USA.

Publication details

Learn Behav. 2012 Sep;40(3):305-19. doi: 10.3758/s13420-012-0082-6.

DOI:10.3758/s13420-012-0082-6
PMID:22927003
Abstract

The temporal-difference (TD) algorithm from reinforcement learning provides a simple method for incrementally learning predictions of upcoming events. Applied to classical conditioning, TD models suppose that animals learn a real-time prediction of the unconditioned stimulus (US) on the basis of all available conditioned stimuli (CSs). In the TD model, similar to other error-correction models, learning is driven by prediction errors--the difference between the change in US prediction and the actual US. With the TD model, however, learning occurs continuously from moment to moment and is not artificially constrained to occur in trials. Accordingly, a key feature of any TD model is the assumption about the representation of a CS on a moment-to-moment basis. Here, we evaluate the performance of the TD model with a heretofore unexplored range of classical conditioning tasks. To do so, we consider three stimulus representations that vary in their degree of temporal generalization and evaluate how the representation influences the performance of the TD model on these conditioning tasks.
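
As a rough illustration of the moment-to-moment learning the abstract describes, the following is a minimal sketch of linear TD(0) applied to a single-CS delay-conditioning trial. It is not the authors' implementation. The function name `run_td_conditioning`, every parameter value, and the stimulus representation (one binary feature per time step between CS onset and the US) are assumptions chosen for illustration; the abstract does not name the three representations it compares. The update uses the standard TD error, delta = US + gamma * V(next) - V(current).

```python
import numpy as np


def run_td_conditioning(n_trials=500, trial_len=20, cs_onset=5, us_time=15,
                        alpha=0.1, gamma=0.95):
    """Linear TD(0) on a delay-conditioning trial (illustrative sketch).

    Assumed representation: one binary feature per time step from CS onset
    until the US, so each moment of the CS has its own weight.
    Returns the learned weight vector (the US prediction at each CS step).
    """
    n_features = us_time - cs_onset
    w = np.zeros(n_features)

    def features(t):
        # Exactly one feature is active while the CS is on, none otherwise.
        x = np.zeros(n_features)
        if cs_onset <= t < us_time:
            x[t - cs_onset] = 1.0
        return x

    for _ in range(n_trials):
        # Learning happens at every time step, not once per trial.
        for t in range(trial_len - 1):
            x, x_next = features(t), features(t + 1)
            us = 1.0 if t + 1 == us_time else 0.0  # US arrives on this transition
            # TD error: actual US plus the discounted change in US prediction.
            delta = us + gamma * (w @ x_next) - (w @ x)
            w += alpha * delta * x
    return w


if __name__ == "__main__":
    weights = run_td_conditioning()
    print(np.round(weights, 3))
```

Under these assumptions the learned prediction rises across the CS toward the time of the US, with each earlier step discounted by one more factor of `gamma`. A representation with broader temporal generalization would share weights across neighbouring time steps instead of giving each step its own.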
