Bushnell P J, Stanton M E
Neurotoxicology Division, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711.
Physiol Behav. 1991 Dec;50(6):1145-51. doi: 10.1016/0031-9384(91)90575-9.
Serial reversals of a spatial discrimination were trained in rats under automaintenance conditions, in which food reward occurred regardless of responding. This automaintained reversal learning was compared to instrumental reversal learning in other rats trained under a similar procedure that required responding for reward. In the automaintenance (AU) procedure, rats received food after every retraction of a "positive" response lever (S+); retraction of a second, "neutral" lever (So) was not paired with food delivery. Responses to the S+ were elicited at fairly constant rates during daily 100-trial conditioning sessions. Responses to the So occurred early in each session but rapidly diminished across trials. When the valences of the levers were reversed, responding shifted to the new S+ and diminished on the new So. The criterion for reversal was defined as a discrimination ratio (DR) of at least 90% responding to the S+ in two consecutive 10-trial blocks. With repeated reversals, acquisition of criterion performance occurred with increasing rapidity, reaching an asymptote below that required for the original discrimination. A second group of rats was trained on a similar instrumental schedule, in which at least one response to the S+ was required for food delivery. Response rates in this instrumental (IN) group were approximately double those of the AU group. However, ratios of S+ to So response rates were similar to those of the AU group, and the serial reversal curves generated were qualitatively similar. Thus, rats can show improvement across serial reversals of a spatial discrimination based entirely on pairings of stimulus events (automaintenance), in a manner similar to that observed in instrumental procedures, in which reward is contingent upon correct responding.
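The reversal criterion described above (a DR of at least 90% to the S+ in two consecutive 10-trial blocks) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the DR is the proportion of all lever responses in a block that fall on the S+, pooled across the block's trials, and that a block with no responses does not count toward criterion. The function names and the per-trial count format are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence

BLOCK_SIZE = 10       # trials per block (stated in the abstract)
DR_CRITERION = 0.90   # minimum proportion of responses on the S+ (stated in the abstract)
BLOCKS_REQUIRED = 2   # consecutive blocks at or above criterion (stated in the abstract)


def discrimination_ratio(s_plus: int, s_zero: int) -> Optional[float]:
    """Proportion of responses made to the S+ (assumed formula).

    Returns None when no responses occurred, since the ratio is undefined.
    """
    total = s_plus + s_zero
    return s_plus / total if total > 0 else None


def trials_to_criterion(
    s_plus_by_trial: Sequence[int], s_zero_by_trial: Sequence[int]
) -> Optional[int]:
    """Number of trials completed when criterion is first met, or None.

    Inputs are hypothetical per-trial response counts on each lever.
    Trials left over after the last full block are ignored.
    """
    n_blocks = min(len(s_plus_by_trial), len(s_zero_by_trial)) // BLOCK_SIZE
    run = 0  # current streak of consecutive blocks at or above criterion
    for block in range(n_blocks):
        start, end = block * BLOCK_SIZE, (block + 1) * BLOCK_SIZE
        dr = discrimination_ratio(
            sum(s_plus_by_trial[start:end]), sum(s_zero_by_trial[start:end])
        )
        # Assumption: a block with no responses breaks the streak.
        run = run + 1 if dr is not None and dr >= DR_CRITERION else 0
        if run == BLOCKS_REQUIRED:
            return end
    return None
```

For example, a session with two blocks of equal responding on both levers followed by two blocks of responding only on the S+ would reach criterion at trial 40 under these assumptions. Comparing this value across successive reversals is one way to trace the kind of serial reversal curve the abstract describes.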