Department of Psychiatry, University of Oxford, Oxford, UK.
University College London, Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, UK.
Psychol Med. 2024 Feb;54(3):631-636. doi: 10.1017/S0033291723002520. Epub 2023 Sep 14.
Learning from rewarded and punished choices is perturbed in depressed patients, suggesting that abnormal reinforcement learning may be a cognitive mechanism of the illness. However, previous studies have disagreed about whether this behavior is produced by alterations in the rate of learning or sensitivity to experienced outcomes. This previous work has generally assessed learning in response to binary outcomes of one valence, rather than to both rewarding and punishing continuous outcomes.
A novel drifting reward and punishment magnitude reinforcement-learning task was administered to patients with current (n = 40) and remitted depression (n = 39), and healthy volunteers (n = 40) to capture potential differences in learning behavior. Standard questionnaires were administered to measure self-reported depressive symptom severity, trait and state anxiety, and level of anhedonic symptoms.
Our findings demonstrate that patients with current depression adjust their learning behaviors to a lesser degree in response to trial-by-trial variations in reward and loss magnitudes than the other groups. Computational modeling revealed that this behavioral signature of current depressive state is better accounted for by reduced reward and punishment sensitivity (all p < 0.031), rather than a change in learning rate (p = 0.708). However, between-group differences were not related to self-reported symptom severity or comorbid anxiety disorders in the current depression group.
These findings suggest that current depression is associated with reduced outcome sensitivity rather than altered learning rate. Previous findings in this domain, drawn mainly from binary learning tasks, appear to generalize to learning from continuous outcomes.
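The distinction between learning rate and outcome sensitivity can be illustrated with a minimal sketch. It assumes a generic Rescorla-Wagner-style update in which a sensitivity parameter scales the experienced outcome before the prediction error is computed; this is not necessarily the model fitted in the study, and the function name, parameter names, and values are hypothetical. Under this assumption, a lower learning rate slows learning but leaves the eventual value estimate unchanged, whereas a lower sensitivity shrinks the value that is ultimately learned.

```python
def learned_value(outcomes, learning_rate, sensitivity, initial_value=0.0):
    """Return the value estimate after a sequence of outcome magnitudes.

    Assumed (illustrative) update rule:
        V <- V + learning_rate * (sensitivity * outcome - V)
    """
    value = initial_value
    for outcome in outcomes:
        # Sensitivity rescales how large the outcome "feels" to the learner.
        prediction_error = sensitivity * outcome - value
        # Learning rate sets how much of the error is absorbed per trial.
        value += learning_rate * prediction_error
    return value


# Hypothetical constant outcome magnitude of 1.0 on every trial.
few_trials = [1.0] * 5
many_trials = [1.0] * 200

# Early in learning, a lower learning rate lags behind the baseline learner.
baseline_early = learned_value(few_trials, learning_rate=0.3, sensitivity=1.0)
slow_early = learned_value(few_trials, learning_rate=0.1, sensitivity=1.0)

# With enough trials, the learning rate no longer matters, but reduced
# sensitivity halves the value the learner converges to.
baseline_late = learned_value(many_trials, learning_rate=0.3, sensitivity=1.0)
slow_late = learned_value(many_trials, learning_rate=0.1, sensitivity=1.0)
blunted_late = learned_value(many_trials, learning_rate=0.3, sensitivity=0.5)

print(baseline_early, slow_early)
print(baseline_late, slow_late, blunted_late)
```

In this sketch the two parameters leave different signatures: altered learning rate changes the speed of adjustment, while reduced sensitivity dampens how strongly outcome magnitudes are reflected in learned values, which is the pattern the modeling results attribute to current depression.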