Eshel Neir, Bukwich Michael, Rao Vinod, Hemmelder Vivian, Tian Ju, Uchida Naoshige
Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA.
Nature. 2015 Sep 10;525(7568):243-6. doi: 10.1038/nature14855. Epub 2015 Aug 31.
Dopamine neurons are thought to facilitate learning by comparing actual and expected reward. Despite two decades of investigation, little is known about how this comparison is made. To determine how dopamine neurons calculate prediction error, we combined optogenetic manipulations with extracellular recordings in the ventral tegmental area while mice engaged in classical conditioning. Here we demonstrate, by manipulating the temporal expectation of reward, that dopamine neurons perform subtraction, a computation that is ideal for reinforcement learning but rarely observed in the brain. Furthermore, selectively exciting and inhibiting neighbouring GABA (γ-aminobutyric acid) neurons in the ventral tegmental area reveals that these neurons are a source of subtraction: they inhibit dopamine neurons when reward is expected, causally contributing to prediction-error calculations. Finally, bilaterally stimulating ventral tegmental area GABA neurons dramatically reduces anticipatory licking to conditioned odours, consistent with an important role for these neurons in reinforcement learning. Together, our results uncover the arithmetic and local circuitry underlying dopamine prediction errors.
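As a rough illustration of what "subtraction" means in this context, the sketch below contrasts a subtractive and a divisive effect of expectation on a set of reward responses. The function names, the numbers, and the particular divisive form `1 / (1 + expectation)` are assumptions chosen for illustration; they are not the paper's model or data.

```python
def subtractive(response, expectation):
    """Expectation lowers the response by a constant amount."""
    return response - expectation


def divisive(response, expectation):
    """Expectation scales the response down by a constant factor (assumed form)."""
    return response / (1.0 + expectation)


# Hypothetical responses to increasing reward sizes (arbitrary units).
unexpected = [2.0, 4.0, 8.0]
expectation = 1.0

sub = [subtractive(r, expectation) for r in unexpected]
div = [divisive(r, expectation) for r in unexpected]

# Under subtraction the drop is identical at every reward size.
sub_drops = [u - s for u, s in zip(unexpected, sub)]
# Under division the drop grows with the size of the response.
div_drops = [u - d for u, d in zip(unexpected, div)]

print("subtractive drops:", sub_drops)
print("divisive drops:   ", div_drops)
```

In this toy setting, a subtractive effect shifts the whole response curve down by the same amount, whereas a divisive effect changes its slope. That difference is what makes the two computations distinguishable when responses are measured across several reward sizes.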