Department of Bio and Brain Engineering, KAIST (Korea Advanced Institute of Science and Technology), Daejeon 305-701, Republic of Korea.
J Neurosci. 2013 Mar 13;33(11):4710-25. doi: 10.1523/JNEUROSCI.3883-12.2013.
The transient response of dopamine neurons has been described as reward prediction error (RPE), with activation or suppression by events that are better or worse than expected, respectively. However, at least a minority of neurons are activated by aversive or high-intensity stimuli, casting doubt on the generality of RPE in describing the dopamine signal. To overcome limitations of previous studies, we studied neuronal responses to a wider variety of high-intensity and aversive stimuli, and we quantified and controlled aversiveness through a choice task in which macaques sacrificed juice to avoid aversive stimuli. Whereas most previous work has portrayed the RPE as a single impulse or "phase," here we demonstrate its multiphasic temporal dynamics. Aversive or high-intensity stimuli evoked a triphasic sequence of activation-suppression-activation extending over a period of 40-700 ms. The initial activation at short latencies (40-120 ms) reflected sensory intensity. The influence of motivational value became dominant between 150 and 250 ms, with activation in the case of appetitive stimuli, and suppression in the case of aversive and neutral stimuli. The previously unreported late activation appeared to be a modest "rebound" after strong suppression. Similarly, strong activation by reward was often followed by suppression. We suggest that these "rebounds" may result from overcompensation by homeostatic mechanisms in some cells. Our results are consistent with a realistic RPE, which evolves over time through a dynamic balance of excitation and inhibition.
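The two ideas the abstract relies on, the RPE as "better or worse than expected" and the separate early (40-120 ms) and later (150-250 ms) response windows, can be sketched in a few lines. This is an illustration only, not the authors' analysis code: the function names, the window labels, and the example spike times are hypothetical, and only the window boundaries are taken from the abstract.

```python
# Illustrative sketch; not the authors' analysis pipeline.
# Window boundaries (ms after stimulus onset) are taken from the abstract;
# the labels and function names are assumptions made for this example.
WINDOWS_MS = {
    "early_intensity": (40, 120),  # initial activation reflecting sensory intensity
    "value": (150, 250),           # period in which motivational value dominates
}


def reward_prediction_error(received, expected):
    """Positive when the outcome is better than expected, negative when worse."""
    return received - expected


def count_spikes(spike_times_ms, window):
    """Count spikes with start <= t < end for a (start, end) window in ms."""
    start, end = window
    return sum(1 for t in spike_times_ms if start <= t < end)


def firing_rate_hz(spike_times_ms, window):
    """Mean firing rate in spikes per second within the window."""
    start, end = window
    return count_spikes(spike_times_ms, window) * 1000.0 / (end - start)


if __name__ == "__main__":
    # Hypothetical spike times (ms) from a single trial.
    spikes = [45, 60, 110, 130, 160, 300]
    for name, window in WINDOWS_MS.items():
        print(name, firing_rate_hz(spikes, window))
```

Comparing each window's rate against a pre-stimulus baseline rate would then indicate activation or suppression in that phase; the baseline comparison is left out here because the abstract does not specify how it was computed.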