Kobayashi Yasushi, Okada Ken-Ichi
Graduate School of Frontier Biosciences, Osaka University, 1-3 Machikaneyama, Toyonaka 560-8531, Japan.
Brain Nerve. 2009 Apr;61(4):397-404.
We address the role of neuronal activity in the brainstem-midbrain circuit in reward processing, and the basis for the hypothesis that this circuit provides advantages over previous reinforcement learning theories. Several lines of evidence support the reward-based learning theory in which midbrain dopamine (DA) neurons emit a teaching signal (the reward prediction error signal) that controls synaptic plasticity in their projection areas. However, where and how the reward prediction error signal is computed remains unclear. Because the pedunculopontine tegmental nucleus (PPTN) in the brainstem is one of the strongest sources of excitatory input to DA neurons, we hypothesized that the PPTN may play an important role in activating DA neurons and in reinforcement learning by relaying to those neurons the signals necessary for computing the reward prediction error. To investigate the involvement of PPTN neurons in this computation, we recorded neuronal activity in monkeys performing a visually guided saccade task. Here, we propose that PPTN neurons may relay the excitatory component of tonic reward prediction and phasic primary reward signals, and we derive a new computational theory of reward prediction error in DA neurons.