McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, MA 02139, USA.
Neuroscience. 2011 Dec 15;198:152-70. doi: 10.1016/j.neuroscience.2011.09.069. Epub 2011 Oct 13.
Most of our motor skills are not innately programmed, but are learned by a combination of motor exploration and performance evaluation, suggesting that they proceed through a reinforcement learning (RL) mechanism. Songbirds have emerged as a model system to study how a complex behavioral sequence can be learned through an RL-like strategy. Interestingly, like motor sequence learning in mammals, song learning in birds requires a basal ganglia (BG)-thalamocortical loop, suggesting common neural mechanisms. Here, we outline a specific working hypothesis for how BG-forebrain circuits could utilize an internally computed reinforcement signal to direct song learning. Our model includes a number of general concepts borrowed from the mammalian BG literature, including a dopaminergic reward prediction error and dopamine-mediated plasticity at corticostriatal synapses. We also invoke a number of conceptual advances arising from recent observations in the songbird. Specifically, there is evidence for a specialized cortical circuit that adds trial-to-trial variability to stereotyped cortical motor programs, and a role for the BG in "biasing" this variability to improve behavioral performance. This BG-dependent "premotor bias" may in turn guide plasticity in downstream cortical synapses to consolidate recently learned song changes. Given the similarity between mammalian and songbird BG-thalamocortical circuits, our model for the role of the BG in this process may have broader relevance to mammalian BG function.
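The learning loop outlined in the abstract (variability added to a stereotyped motor program, a BG-dependent bias shaped by a reward prediction error, and slow consolidation of that bias into the motor pathway) can be illustrated with a toy simulation. The sketch below is not the authors' model. It assumes a single scalar song feature, a reward defined as the negative squared distance from a target, and a standard perturbation-based update (the change in bias is proportional to the prediction error times the exploratory perturbation). All names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_song_learning(target=1.0, trials=5000, noise_sd=0.1,
                           bias_lr=0.5, consolidation_rate=0.01,
                           value_lr=0.1, seed=0):
    """Toy RL loop for one scalar song feature (all names are hypothetical)."""
    rng = random.Random(seed)
    motor_program = 0.0     # stereotyped cortical motor program
    bias = 0.0              # BG-dependent premotor bias
    expected_reward = 0.0   # running reward prediction

    for _ in range(trials):
        # Trial-to-trial variability added to the motor program.
        exploration = rng.gauss(0.0, noise_sd)
        output = motor_program + bias + exploration

        # Assumed internal evaluation: closer to the target is better.
        reward = -(output - target) ** 2

        # Reward prediction error relative to the running prediction.
        rpe = reward - expected_reward

        # Prediction-error-gated update: reinforce perturbations that
        # did better than expected, suppress those that did worse.
        bias += bias_lr * rpe * exploration
        expected_reward += value_lr * rpe

        # Slow consolidation: move a small fraction of the bias into
        # the motor program so the learned change no longer needs it.
        transfer = consolidation_rate * bias
        motor_program += transfer
        bias -= transfer

    return motor_program, bias


if __name__ == "__main__":
    program, residual_bias = simulate_song_learning()
    print(f"motor program: {program:.3f}, residual bias: {residual_bias:.3f}")
```

In this sketch the bias carries the learned change first, and the consolidation step gradually shifts it into the motor program, so the bias is expected to shrink again once the output settles near the target. The split between a fast bias and a slow motor program is a design choice made to mirror the two stages described in the abstract, not a claim about the underlying circuit dynamics.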