Kyle S. Smith, Ann M. Graybiel
Department of Psychological and Brain Sciences, Dartmouth College, Hanover, New Hampshire; and
McGovern Institute for Brain Research and Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts.
J Neurophysiol. 2016 Mar;115(3):1487-98. doi: 10.1152/jn.00925.2015. Epub 2016 Jan 6.
Evaluating outcomes of behavior is a central function of the striatum. In circuits engaging the dorsomedial striatum, sensitivity to goal value is accentuated during learning, whereas outcome sensitivity is thought to be minimal in the dorsolateral striatum and its habit-related corticostriatal circuits. However, a distinct population of projection neurons in the dorsolateral striatum exhibits selective sensitivity to rewards. Here, we evaluated the outcome-related signaling in such neurons as rats performed an instructional T-maze task for two rewards. As the rats formed maze-running habits and then changed behavior after reward devaluation, we detected outcome-related spike activity in 116 units out of 1,479 recorded units. During initial training, nearly equal numbers of these units fired preferentially either after rewarded runs or after unrewarded runs, and the majority were responsive at only one of two reward locations. With overtraining, as habits formed, firing in nonrewarded trials almost disappeared, and reward-specific firing declined. Thus error-related signaling was lost, and reward signaling became generalized. Following reward devaluation, in an extinction test, postgoal activity was nearly undetectable, despite accurate running. Strikingly, when rewards were then returned, postgoal activity reappeared and recapitulated the original early response pattern, with nearly equal numbers responding to rewarded and unrewarded runs and to single rewards. These findings demonstrate that outcome evaluation in the dorsolateral striatum is highly plastic and tracks stages of behavioral exploration and exploitation. These signals could be a new target for understanding compulsive behaviors that involve changes to dorsal striatum function.