Watabe-Uchida Mitsuko, Eshel Neir, Uchida Naoshige
Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138
Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305
Annu Rev Neurosci. 2017 Jul 25;40:373-394. doi: 10.1146/annurev-neuro-072116-031109. Epub 2017 Apr 24.
Dopamine neurons facilitate learning by calculating reward prediction error, or the difference between expected and actual reward. Despite two decades of research, it remains unclear how dopamine neurons make this calculation. Here we review studies that tackle this problem from a diverse set of approaches, from anatomy to electrophysiology to computational modeling and behavior. Several patterns emerge from this synthesis: that dopamine neurons themselves calculate reward prediction error, rather than inherit it passively from upstream regions; that they combine multiple separate and redundant inputs, which are themselves interconnected in a dense recurrent network; and that despite the complexity of inputs, the output from dopamine neurons is remarkably homogeneous and robust. The more we study this simple arithmetic computation, the knottier it appears to be, suggesting a daunting (but stimulating) path ahead for neuroscience more generally.
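As a concrete illustration of the arithmetic described above, the following is a minimal sketch of a reward prediction error and of how such an error could drive learning. It is not taken from the review: the sign convention (actual minus expected), the function names, the learning rate of 0.5, and the simple incremental update rule are all assumptions chosen for clarity, and they say nothing about how dopamine neurons implement the computation biologically.

```python
def reward_prediction_error(actual_reward: float, expected_reward: float) -> float:
    """Signed difference between outcome and expectation.

    Assumed convention: positive when the reward is better than expected,
    negative when it is worse.
    """
    return actual_reward - expected_reward


def update_expectation(expected_reward: float, actual_reward: float,
                       learning_rate: float = 0.5) -> float:
    """Move the expectation part of the way toward the outcome.

    This incremental rule and the learning rate are illustrative
    assumptions, not a model proposed in the source.
    """
    error = reward_prediction_error(actual_reward, expected_reward)
    return expected_reward + learning_rate * error


# Hypothetical scenario: a reward of 1.0 is delivered on five consecutive
# trials, starting from no expectation of reward.
expected = 0.0
errors = []
for _ in range(5):
    errors.append(reward_prediction_error(1.0, expected))
    expected = update_expectation(expected, 1.0)

print(errors)    # the error shrinks as the reward becomes predicted
print(expected)  # the expectation approaches the delivered reward
```

Under these assumptions, the error is largest when the reward is fully unexpected and shrinks toward zero as the expectation catches up. Omitting an expected reward would give a negative error.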