Tan Can Ozan, Bullock Daniel
Cognitive and Neural Systems Department, Boston University, Boston, Massachusetts 02215, USA.
J Neurosci. 2008 Oct 1;28(40):10062-74. doi: 10.1523/JNEUROSCI.0259-08.2008.
Recently, dopamine (DA) neurons of the substantia nigra pars compacta (SNc) were found to exhibit sustained responses related to reward uncertainty, in addition to the phasic responses related to reward-prediction errors (RPEs). Thus, cue-dependent anticipations of the timing, magnitude, and uncertainty of rewards are learned and reflected in components of DA signals. Here we simulate a local circuit model to show how learned uncertainty responses are generated, along with phasic RPE responses, on single trials. Both types of simulated DA responses exhibit the empirically observed dependencies on conditional probability, expected value of reward, and time since onset of the reward-predicting cue. The model's three major pathways compute expected values of cues, timed predictions of reward magnitudes, and uncertainties associated with these predictions. The first two pathways' computations refine those modeled by Brown et al. (1999). The third, newly modeled, pathway involves medium spiny projection neurons (MSPNs) of the striatal matrix, whose axons corelease GABA and substance P, both at synapses with GABAergic neurons in the substantia nigra pars reticulata (SNr) and with distal dendrites (in SNr) of DA neurons whose somas are located in ventral SNc. Corelease enables efficient computation of uncertainty responses that are a nonmonotonic function of the conditional probability of reward, and variability in striatal cholinergic transmission can explain observed individual differences in the amplitudes of uncertainty responses. The involvement of matricial MSPNs and cholinergic transmission within the striatum implies a relation between uncertainty in cue-reward contingencies and action-selection functions of the basal ganglia.
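The two signal types described above can be illustrated with a minimal sketch. It is not the local circuit model: the abstract gives no equations, so the sketch assumes a Bernoulli reward and uses its variance as a stand-in for uncertainty, with the RPE taken as the received reward minus the expected value. The function names and the `magnitude` parameter are hypothetical and chosen only for illustration.

```python
# Illustrative only: a normative stand-in, NOT the circuit model from the paper.
# Assumption: a reward of fixed size `magnitude` arrives with conditional probability p.


def expected_value(p, magnitude=1.0):
    """Expected reward predicted by a cue with reward probability p."""
    return p * magnitude


def reward_prediction_error(reward, p, magnitude=1.0):
    """Phasic-style RPE: received reward minus the cue's expected value."""
    return reward - expected_value(p, magnitude)


def bernoulli_uncertainty(p, magnitude=1.0):
    """Reward variance (assumed uncertainty measure): zero at p = 0 and p = 1."""
    return magnitude ** 2 * p * (1.0 - p)


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(
            f"p={p:.2f}",
            f"uncertainty={bernoulli_uncertainty(p):.4f}",
            f"RPE(rewarded)={reward_prediction_error(1.0, p):+.2f}",
            f"RPE(omitted)={reward_prediction_error(0.0, p):+.2f}",
        )
```

Under this assumption the uncertainty term rises and then falls as p goes from 0 to 1, which is the sense in which it is nonmonotonic, whereas the expected value and the RPE vary monotonically with p.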