Leibniz Institute for Neurobiology, Brenneckestr. 6, D-39118 Magdeburg, Germany.
Neuroscience. 2010 Mar 31;166(3):752-62. doi: 10.1016/j.neuroscience.2010.01.010. Epub 2010 Jan 19.
Learned changes in behavior can be elicited by either appetitive or aversive reinforcers. It is not clear, however, whether the two types of motivation (approaching appetitive stimuli and avoiding aversive stimuli) drive learning in the same or in different ways, nor is their interaction understood in situations where the two are combined in a single experiment. To investigate these questions, we developed a novel learning paradigm for Mongolian gerbils that not only allows rewards and punishments to be presented in isolation or in combination, but also allows these opposite reinforcers to drive the same learned behavior. Specifically, we studied learning of tone-conditioned hurdle crossing in a shuttle box driven by an appetitive reinforcer (brain stimulation reward), an aversive reinforcer (electrical footshock), or a combination of both. Compared with either reinforcer alone, the combination of the two reinforcers increased the speed of acquisition, led to the maximum possible performance, and delayed extinction. Additional experiments, using partial reinforcement protocols and protocols in which one of the reinforcers was omitted after the animals had been trained with the combination of both, indicated that appetitive and aversive reinforcers operated together but acted in different ways: in this particular experimental context, punishment appeared to be more effective for initial acquisition, and reward more effective for maintaining a high level of conditioned responses (CRs). The results imply that learning mechanisms in problem solving were maximally effective when the initial punishment of mistakes was combined with the subsequent rewarding of correct performance.