Bayer Hannah M, Lau Brian, Glimcher Paul W
Center for Neural Science, New York University, 4 Washington Place, 809, New York, NY 10003, USA.
J Neurophysiol. 2007 Sep;98(3):1428-39. doi: 10.1152/jn.01140.2006. Epub 2007 Jul 5.
Work in behaving primates indicates that midbrain dopamine neurons encode a prediction error, the difference between an obtained reward and the reward expected. Studies of dopamine action potential timing in the alert and anesthetized rat indicate that dopamine neurons respond in tonic and phasic modes, a distinction that has been less well characterized in primates. We used spike train models to examine the relationship between the tonic and burst modes of activity in dopamine neurons while monkeys were performing a reinforced visuo-saccadic movement task. We studied spiking activity during four task-related intervals; two of these were intervals during which no task-related events occurred, whereas two were periods marked by task-related phasic activity. We found that dopamine neuron spike trains during the intervals when no events occurred were well described as tonic. Action potentials appeared to be independent, to occur at low frequency, and to be almost equally well described by Gaussian and Poisson-like (gamma) processes. Unlike in the rat, interspike intervals as low as 20 ms were often observed during these presumptively tonic epochs. Having identified these periods of presumptively tonic activity, we were able to quantitatively define phasic modulations (both increases and decreases in activity) during the intervals in which task-related events occurred. This analysis revealed that the phasic modulations of these neurons include both bursting, as has been described previously, and pausing. Together, bursts and pauses seemed to provide a continuous, although nonlinear, representation of the theoretically defined reward prediction error of reinforcement learning.
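The prediction error named in the abstract, the obtained reward minus the expected reward, can be illustrated with a minimal sketch. The abstract does not say how the expectation is computed, so the delta-rule update below, the learning rate `alpha`, the starting expectation, and the function name `prediction_errors` are all assumptions made for illustration, not the authors' method.

```python
def prediction_errors(rewards, alpha=0.5, initial_expectation=0.0):
    """Return the per-trial reward prediction error for a sequence of rewards.

    Assumption (not from the abstract): the expected reward is updated
    with a simple delta rule using a fixed learning rate `alpha`.
    """
    expected = initial_expectation
    errors = []
    for obtained in rewards:
        # Prediction error: obtained reward minus expected reward.
        delta = obtained - expected
        errors.append(delta)
        # Hypothetical update: move the expectation toward the outcome.
        expected += alpha * delta
    return errors


if __name__ == "__main__":
    # Two rewarded trials followed by an omitted reward.
    print(prediction_errors([1.0, 1.0, 0.0], alpha=0.5))
```

In this toy sequence the error is positive while reward exceeds expectation and turns negative when an expected reward is omitted, the two signs that the abstract associates with bursts and pauses, respectively.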