Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK.
Nat Commun. 2024 Sep 17;15(1):8138. doi: 10.1038/s41467-024-52311-8.
The dopamine reward prediction error signal is known to be subjective but has so far been assessed only against aggregate choices. However, individual choices fluctuate across trials and thus reflect the instantaneous subjective reward value. In the well-established Becker-DeGroot-Marschak (BDM) auction-like mechanism, participants are encouraged to place bids that accurately reveal their instantaneous subjective reward value; inaccurate bidding results in suboptimal reward ("incentive compatibility"). In our experiment, male rhesus monkeys learned over several years to place accurate BDM bids for juice rewards without specific external constraints. Their bids for physically identical rewards varied trial by trial and increased overall for larger rewards. In these highly experienced animals, responses of midbrain dopamine neurons followed the trial-by-trial variations of bids despite constant, explicitly predicted reward amounts. Conversely, dopamine responses were similar when bids were similar for different physical reward amounts. Support Vector Regression demonstrated accurate prediction of the animals' bids by as few as twenty dopamine neurons. Thus, the phasic dopamine reward signal reflects instantaneous subjective reward value.
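The incentive-compatibility property of the BDM mechanism mentioned above can be illustrated with a minimal sketch. It assumes a simplified, normalized auction that is not taken from the paper: the bidder's subjective value and the bid lie in [0, 1], a computer price is drawn uniformly from [0, 1], and the bidder receives the reward and pays the price only when the bid is at least the price. The function names `expected_bdm_payoff` and `best_bid` are hypothetical and chosen for illustration only.

```python
def expected_bdm_payoff(bid, value, n_prices=1001):
    """Average payoff of one bid under an assumed uniform price on [0, 1].

    If the bid is at least the drawn price, the bidder wins the reward
    (worth `value` to them) and pays the price; otherwise the payoff is zero.
    """
    total = 0.0
    for i in range(n_prices):
        price = i / (n_prices - 1)  # evenly spaced stand-in for a uniform draw
        if bid >= price:
            total += value - price
    return total / n_prices


def best_bid(value, n_bids=101):
    """Return the candidate bid in [0, 1] with the highest expected payoff."""
    candidates = [i / (n_bids - 1) for i in range(n_bids)]
    return max(candidates, key=lambda b: expected_bdm_payoff(b, value))


if __name__ == "__main__":
    value = 0.6  # hypothetical subjective value, normalized units
    for bid in (0.3, 0.6, 0.9):
        print(f"bid={bid:.1f} expected payoff={expected_bdm_payoff(bid, value):.4f}")
    print("best bid:", best_bid(value))
```

Under these assumptions, underbidding forgoes trials on which the reward would have been worth more than its price, and overbidding wins trials on which the price exceeds the subjective value, so the expected payoff peaks when the bid equals the subjective value. This is the sense in which a bid can be read as a trial-by-trial report of that value.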