Kubanek Jan, Snyder Lawrence H, Abrams Richard A
Department of Anatomy and Neurobiology, Washington University School of Medicine, St. Louis, MO 63110, USA.
Cognition. 2015 Jun;139:154-67. doi: 10.1016/j.cognition.2015.03.005. Epub 2015 Mar 28.
Behavior rests on the experience of reinforcement and punishment. It has been unclear whether reinforcement and punishment act as oppositely valenced components of a single behavioral factor, or whether these two kinds of outcomes play fundamentally distinct behavioral roles. To address this question, we used monetary tokens to vary the magnitude of the reward or penalty experienced following a choice. The outcome of each trial was independent of the outcome of the previous trial, which enabled us to isolate and study the effect of each outcome magnitude on behavior in single trials. We found that a reward led to a repetition of the previous choice, whereas a penalty led to an avoidance of the previous choice. Surprisingly, the effects of reward magnitude and penalty magnitude showed a pronounced asymmetry. The choice repetition effect of a reward scaled with the magnitude of the reward. In marked contrast, the avoidance effect of a penalty was flat: it was not influenced by the magnitude of the penalty. These effects were described mechanistically by a reinforcement learning model once the model was updated to account for the penalty-based asymmetry. The asymmetry between the effects of reward magnitude and penalty magnitude was so striking that it is difficult to conceive of one factor as merely a weighted or transformed form of the other. Instead, the data suggest that rewards and penalties are fundamentally distinct factors in governing behavior.
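The abstract does not specify the reinforcement learning model, so the following is only an illustrative sketch of how such an asymmetry could be expressed, not the authors' implementation. It assumes a two-option task, a simple delta-rule value update, and a logistic choice rule. The function names, the learning rate, the inverse temperature, and the constant penalty value are all hypothetical. The one idea taken from the abstract is that a reward enters the update with its magnitude, whereas a penalty enters as a fixed value regardless of its magnitude.

```python
import math


def effective_outcome(outcome, penalty_value=-1.0):
    """Map a signed token outcome to the value used for learning.

    Assumption for illustration: rewards (outcome > 0) keep their magnitude,
    while every penalty (outcome < 0) collapses to the same constant.
    """
    if outcome > 0:
        return float(outcome)
    if outcome < 0:
        return penalty_value
    return 0.0


def update_values(values, choice, outcome, learning_rate=0.5):
    """Delta-rule update of the chosen option's value; returns a new list."""
    new_values = list(values)
    target = effective_outcome(outcome)
    new_values[choice] += learning_rate * (target - new_values[choice])
    return new_values


def p_repeat(values, choice, inverse_temperature=1.0):
    """Probability of repeating `choice` in a two-option task (logistic rule)."""
    other = 1 - choice
    diff = values[choice] - values[other]
    return 1.0 / (1.0 + math.exp(-inverse_temperature * diff))


if __name__ == "__main__":
    start = [0.0, 0.0]  # both options initially valued equally
    for outcome in (3, 1, -1, -3):
        updated = update_values(start, choice=0, outcome=outcome)
        print(outcome, round(p_repeat(updated, choice=0), 3))
```

Under these assumptions, the probability of repeating the previous choice rises with reward magnitude but is the same, and below chance, for a small and a large penalty, which mirrors the qualitative pattern reported above.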