Reward and punishment act as distinct factors in guiding behavior.

Authors

Kubanek Jan, Snyder Lawrence H, Abrams Richard A

Affiliation

Department of Anatomy and Neurobiology, Washington University School of Medicine, St. Louis, MO 63110, USA.

Publication information

Cognition. 2015 Jun;139:154-67. doi: 10.1016/j.cognition.2015.03.005. Epub 2015 Mar 28.

DOI: 10.1016/j.cognition.2015.03.005

PMID: 25824862

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4397189/

Abstract

Behavior rests on the experience of reinforcement and punishment. It has been unclear whether reinforcement and punishment act as oppositely valenced components of a single behavioral factor, or whether these two kinds of outcomes play fundamentally distinct behavioral roles. To this end, we varied the magnitude of a reward or a penalty experienced following a choice using monetary tokens. The outcome of each trial was independent of the outcome of the previous trial, which enabled us to isolate and study the effect on behavior of each outcome magnitude in single trials. We found that a reward led to a repetition of the previous choice, whereas a penalty led to an avoidance of the previous choice. Surprisingly, the effects of the reward magnitude and the penalty magnitude revealed a pronounced asymmetry. The choice repetition effect of a reward scaled with the magnitude of the reward. In a marked contrast, the avoidance effect of a penalty was flat, not influenced by the magnitude of the penalty. These effects were mechanistically described using a reinforcement learning model after the model was updated to account for the penalty-based asymmetry. The asymmetry in the effects of the reward magnitude and the punishment magnitude was so striking that it is difficult to conceive that one factor is just a weighted or transformed form of the other factor. Instead, the data suggest that rewards and penalties are fundamentally distinct factors in governing behavior.
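The asymmetry described in the abstract can be illustrated with a toy value-learning sketch. This is not the authors' model, whose equations the abstract does not give. It assumes a simple delta-rule update and a logistic choice rule, and it assumes that a gain enters learning in proportion to its size while any loss enters as one fixed negative value. All function names, the learning rate, the inverse temperature, and the fixed penalty value are hypothetical choices made for illustration.

```python
import math


def effective_outcome(tokens, penalty_value=-1.0):
    """Map a token outcome to the value used for learning.

    Assumption (not from the paper): a gain counts in proportion to its
    size, while any loss counts as the same fixed negative value,
    regardless of how many tokens were lost.
    """
    if tokens > 0:
        return float(tokens)
    if tokens < 0:
        return penalty_value
    return 0.0


def update_value(value, tokens, learning_rate=0.5):
    """Apply one delta-rule update to the value of the chosen option."""
    return value + learning_rate * (effective_outcome(tokens) - value)


def p_repeat(value_chosen, value_other, inverse_temperature=1.0):
    """Logistic probability of repeating the previous choice (two options)."""
    diff = value_chosen - value_other
    return 1.0 / (1.0 + math.exp(-inverse_temperature * diff))


def repeat_probability_after(tokens):
    """Probability of repeating a choice after one outcome, from neutral values."""
    updated = update_value(0.0, tokens)
    return p_repeat(updated, 0.0)


if __name__ == "__main__":
    # Print the repeat probability for a range of losses and gains.
    for tokens in (-4, -2, -1, 1, 2, 4):
        print(tokens, round(repeat_probability_after(tokens), 3))
```

Under these assumptions the probability of repeating a choice rises with the size of a gain, while every loss, whatever its size, yields the same below-chance probability. This reproduces the qualitative pattern the abstract reports, not the fitted model.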




