Department of Psychiatry, University of Oxford, Oxford, United Kingdom; Oxford Health National Health Service Foundation Trust, Warneford Hospital, Oxford, United Kingdom.
Department of Psychiatry, University of Oxford, Oxford, United Kingdom; Clinical Division of Social Psychiatry, Department of Psychiatry and Psychotherapy, Medical University of Vienna, Vienna, Austria.
Biol Psychiatry. 2024 Feb 1;95(3):286-296. doi: 10.1016/j.biopsych.2023.05.023. Epub 2023 Jun 15.
Dopamine D2-like agonists show promise as treatments for depression. They are thought to act by enhancing reward learning; however, the mechanisms by which they achieve this are not clear. Reinforcement learning accounts describe 3 distinct candidate mechanisms: increased reward sensitivity, increased inverse decision-temperature, and decreased value decay. As these mechanisms produce equivalent effects on behavior, arbitrating between them requires measurement of how expectations and prediction errors are altered. We characterized the effects of 2 weeks of the D2-like agonist pramipexole on reward learning and used functional magnetic resonance imaging measures of expectation and prediction error to assess which of these 3 mechanistic processes were responsible for the behavioral effects.
Forty healthy volunteers (50% female) were randomized to 2 weeks of pramipexole (titrated to 1 mg/day) or placebo in a double-blind, between-subject design. Participants completed a probabilistic instrumental learning task before and after the pharmacological intervention, with functional magnetic resonance imaging data collected at the second visit. Asymptotic choice accuracy and a reinforcement learning model were used to assess reward learning.
Pramipexole increased choice accuracy in the reward condition with no effect on losses. Participants who received pramipexole had increased blood oxygen level-dependent response in the orbital frontal cortex during the expectation of win trials but decreased blood oxygen level-dependent response to reward prediction errors in the ventromedial prefrontal cortex. This pattern of results indicates that pramipexole enhances choice accuracy by reducing the decay of estimated values during reward learning.
The D2-like receptor agonist pramipexole enhances reward learning by preserving learned values. This is a plausible mechanism for pramipexole's antidepressant effect.
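To make the three candidate mechanisms named in the abstract concrete, the following is a minimal sketch of a two-option value-learning model with a softmax choice rule. It is not the authors' model: the function names, the parameter values, the 70%/30% reward probabilities, and the choice to decay all values toward zero after every trial are assumptions made purely for illustration. In this sketch, `reward_sensitivity` scales the experienced outcome, `inverse_temperature` controls how deterministically values are turned into choices, and `decay` shrinks stored values over time.

```python
import math
import random


def choice_probability(q_a, q_b, inverse_temperature):
    """Softmax probability of choosing option A over option B."""
    return 1.0 / (1.0 + math.exp(-inverse_temperature * (q_a - q_b)))


def update_values(q, choice, outcome, learning_rate, reward_sensitivity, decay):
    """Apply one learning step and return (new values, prediction error).

    The expectation is the stored value of the chosen option; the prediction
    error is the sensitivity-scaled outcome minus that expectation.
    Assumption: every stored value then decays toward zero by `decay`.
    """
    prediction_error = reward_sensitivity * outcome - q[choice]
    updated = list(q)
    updated[choice] += learning_rate * prediction_error
    updated = [(1.0 - decay) * value for value in updated]
    return updated, prediction_error


def simulate(n_trials=60, learning_rate=0.3, reward_sensitivity=1.0,
             inverse_temperature=3.0, decay=0.0, p_reward=(0.7, 0.3), seed=0):
    """Simulate one learner; return (fraction of best-option choices, final values)."""
    rng = random.Random(seed)
    q = [0.0, 0.0]
    n_best = 0
    for _ in range(n_trials):
        p_a = choice_probability(q[0], q[1], inverse_temperature)
        choice = 0 if rng.random() < p_a else 1
        outcome = 1.0 if rng.random() < p_reward[choice] else 0.0
        q, _ = update_values(q, choice, outcome, learning_rate,
                             reward_sensitivity, decay)
        n_best += int(choice == 0)
    return n_best / n_trials, q


if __name__ == "__main__":
    for decay in (0.0, 0.2):
        accuracy, values = simulate(decay=decay)
        print(f"decay={decay}: accuracy={accuracy:.2f}, values={values}")
```

Under these assumptions, repeatedly rewarding the same option with less decay leaves a higher stored expectation and a smaller prediction error on the next reward, which is the direction of the imaging pattern the abstract reports. The sketch says nothing about how the study's model was actually parameterized or fitted.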