Takikawa Yoriko, Kawagoe Reiko, Hikosaka Okihide
Department of Physiology, Juntendo University, School of Medicine, Tokyo 113-8421, Japan.
J Neurophysiol. 2004 Oct;92(4):2520-9. doi: 10.1152/jn.00238.2004. Epub 2004 May 26.
Dopamine (DA) neurons respond to sensory stimuli that predict reward. To understand how DA neurons acquire this ability, we trained monkeys on a one-direction-rewarded version of the memory-guided saccade task (1DR), conducting this training only while we recorded from single DA neurons. In 1DR, the position-reward mapping was changed across blocks of trials. In the early stage of 1DR training, DA neurons responded to reward delivery; in the later stages, they responded predominantly, and differentially, to the visual cue that predicted reward or no reward (the reward predictor). We found that this shift of activity from reward to reward predictor also occurred within a block of trials after the position-reward mapping was altered. A main effect of long-term training was to accelerate the within-block reward-to-predictor shift of DA neuronal responses. The within-block shift first appeared in the intermediate stage, but it was slow, and DA neurons often responded to the cue that had indicated reward in the preceding block. In the advanced stage, the reward-to-predictor shift occurred quickly, so that the DA neurons' responses to visual cues faithfully matched the current position-reward mapping. Changes in the DA neuronal responses co-varied with the reward-predictive differentiation of saccade latency in both short-term (within-block) and long-term adaptation. The DA neurons' response to the fixation point also underwent long-term changes until it occurred predominantly in the first trial within a block. This response might trigger a switch between the learned sets. These results suggest that midbrain DA neurons play an essential role in adapting oculomotor behavior to frequent switches in position-reward mapping.