Center for Brain Science, Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA.
Curr Opin Neurobiol. 2021 Apr;67:95-105. doi: 10.1016/j.conb.2020.08.014. Epub 2020 Nov 10.
In the brain, dopamine is thought to drive reward-based learning by signaling temporal difference reward prediction errors (TD errors), a 'teaching signal' used to train computers. Recent studies using optogenetic manipulations have provided multiple lines of evidence that phasic dopamine signals function as TD errors. Furthermore, novel experimental results indicate that when the current state of the environment is uncertain, dopamine neurons compute TD errors using 'belief states', that is, probability distributions over potential states. How belief states are computed remains unclear, but emerging evidence suggests the involvement of the prefrontal cortex and the hippocampus. These results refine our understanding of the role of dopamine in learning and of the algorithms by which dopamine functions in the brain.
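To make the two computational ideas in the abstract concrete, the following is a minimal sketch of a TD error computed over a belief state. It assumes a linear value function in which the value of a belief is the belief-weighted sum of per-state values, and a standard Bayesian belief update. All function names, parameters, and numbers are hypothetical and chosen for illustration; this is not the model or analysis used in the studies reviewed.

```python
import numpy as np


def td_error(reward, value_now, value_next, gamma=0.9):
    """TD error: reward plus discounted next value, minus current value."""
    return reward + gamma * value_next - value_now


def belief_value(belief, weights):
    """Value of a belief state, assumed linear: sum over states of b(s) * w(s)."""
    return float(np.dot(belief, weights))


def bayes_update(belief, transition, likelihood):
    """Update a belief over hidden states after one observation.

    transition[i, j] is P(next state = j | current state = i).
    likelihood[j] is P(observation | next state = j).
    """
    predicted = transition.T @ belief      # propagate belief through dynamics
    posterior = likelihood * predicted     # weight by the observation likelihood
    return posterior / posterior.sum()     # renormalize to a distribution


# Illustrative numbers only: two hidden states, the second one rewarded.
weights = np.array([0.0, 1.0])             # hypothetical learned state values
belief = np.array([0.5, 0.5])              # initial uncertainty about the state
transition = np.eye(2)                     # assume the hidden state does not change
likelihood = np.array([0.2, 0.8])          # the observation favors the second state

next_belief = bayes_update(belief, transition, likelihood)
delta = td_error(
    reward=0.0,
    value_now=belief_value(belief, weights),
    value_next=belief_value(next_belief, weights),
)

# One step of TD learning on the weights, credited in proportion to the belief.
alpha = 0.1                                # hypothetical learning rate
weights = weights + alpha * delta * belief
```

In this sketch the TD error is positive even though no reward is delivered, because the observation shifts the belief toward the higher-valued state. That is the sense in which a prediction error can depend on the inferred state rather than on the observed stimulus alone.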