Hitchcock Peter F, Kim Joonhwa, Frank Michael J
Department of Psychology, Emory University.
Carney Institute for Brain Science, Brown University.
J Exp Psychol Gen. 2025 Sep 1. doi: 10.1037/xge0001817.
Humans learn adaptive behaviors via a durable but incremental reinforcement learning (RL) system and a fast but fleeting working memory (WM) system. Past work parsing these systems focused on reward learning alone; hence, little is known about how they interact while simultaneously learning to avoid punishment and whether arbitrating between these demands is disrupted by psychiatric symptoms. We administered a novel reward/punishment RL-WM task to an online sample oversampled for depression and anxiety symptoms (N = 298; n = 275 after quality control). Participants avoided punishment during initial learning, yet poorly retained this avoidance. Computational modeling captured this pattern via the fleeting WM system facilitating punishment avoidance, while the durable RL system retained little about punishment. Our task also included two test phases interleaved with learning, which permitted a targeted examination of past findings that WM blunts the RL system. When RL-based retention was tested midway through learning, we indeed found evidence of blunting. Yet, after learning resumed, leading to further prediction errors, blunting was no longer evident in a final test phase. However, individual differences moderated this effect: Some individuals were especially susceptible to blunting; for others, WM actually facilitated retention. Finally, task performance was largely spared as a function of depression/anxiety and trait rumination. Overall, our findings demonstrate that, when seeking to attain reward and avoid punishment concurrently, the WM system can facilitate short-term punishment avoidance while the RL system retains little about punishment, reveal individual differences in the extent to which WM blunts RL, and demonstrate intact behavior under internalizing-disorder symptoms. (PsycInfo Database Record (c) 2025 APA, all rights reserved).