Laboratoire de Neurobiologie de la Cognition, Université de Provence, Centre National de la Recherche Scientifique, France.
J Neurosci. 2011 Jan 26;31(4):1507-15. doi: 10.1523/JNEUROSCI.4880-10.2011.
The detection of differences between predictions and actual outcomes is important for associative learning and for selecting actions according to their potential future reward. There are reports that tonically active neurons (TANs) in the primate striatum may carry information about errors in the prediction of rewards. However, this property seems to be expressed in classical conditioning tasks but not during performance of an instrumental task. To address this issue, we recorded the activity of TANs in the putamen of two monkeys performing an instrumental task in which probabilistic rewarding outcomes were contingent on an action in block-design experiments. Behavioral evidence suggests that animals adjusted their performance according to the level of probability for reward on each trial block. We found that the TAN response to reward was stronger as the reward probability decreased; this effect was especially prominent in the late component of the pause-rebound response pattern typical of these neurons. Responsiveness to reward omission also increased with increasing reward probability, whereas there were no detectable effects on responses to the stimulus that triggered the movement. Overall, the modulation of TAN responses by reward probability appeared relatively weak compared with that observed previously in a probabilistic classical conditioning task using the same block design. These data indicate that instrumental conditioning was less effective at demonstrating prediction error signaling in TANs. We conclude that the sensitivity of the TAN system to reward probability depends on the specific learning situation in which animals experienced the stimulus-reward associations.
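As a rough illustration of why a prediction error account predicts the direction of both effects, the sketch below uses the textbook definition of a reward prediction error (outcome minus expected reward). It is not the authors' analysis: the function name, the unit reward magnitude, and the block probabilities are assumptions chosen for illustration, since the abstract does not state the values used.

```python
def prediction_error(reward: float, p_reward: float) -> float:
    """Outcome minus expectation, assuming unit reward so the expectation equals p_reward."""
    return reward - p_reward


# Hypothetical block probabilities, for illustration only (not taken from the study).
probabilities = [0.25, 0.5, 0.75, 1.0]

# For each block: (error when reward is delivered, error when reward is omitted).
errors = {
    p: (prediction_error(1.0, p), prediction_error(0.0, p))
    for p in probabilities
}

for p, (delivered, omitted) in errors.items():
    print(f"p={p:.2f}  reward delivered: {delivered:+.2f}  reward omitted: {omitted:+.2f}")
```

Under these assumptions the positive error on rewarded trials grows as reward probability falls, and the magnitude of the negative error on omission trials grows as reward probability rises, which is the same direction as the reported changes in TAN responses.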