Swiss Center for Affective Science, University of Geneva.
Département d'Études Cognitives, École Normale Supérieure, PSL Research University.
Psychol Rev. 2023 Jul;130(4):1017-1043. doi: 10.1037/rev0000424. Epub 2023 May 8.
We systematically misjudge our own performance in simple economic tasks. First, we generally overestimate our ability to make correct choices, a bias called overconfidence. Second, we are more confident in our choices when we seek gains than when we try to avoid losses, a bias we refer to as the valence-induced confidence bias. Strikingly, these two biases are also present in reinforcement-learning (RL) contexts, despite the fact that outcomes are provided trial by trial and could, in principle, be used to recalibrate confidence judgments online. How confidence biases emerge and are maintained in reinforcement-learning contexts is thus puzzling and still unaccounted for. To explain this paradox, we propose that confidence biases stem from learning biases, and we test this hypothesis using data from multiple experiments in which we concomitantly assessed instrumental choices and confidence judgments during learning and transfer phases. Our results first show that participants' choices in both tasks are best accounted for by a reinforcement-learning model featuring context-dependent learning and confirmatory updating. We then demonstrate that the complex, biased pattern of confidence judgments elicited during both tasks can be explained by an overweighting of the learned value of the chosen option in the computation of confidence judgments. We finally show that, consequently, the individual learning-model parameters responsible for the learning biases (confirmatory updating and outcome context-dependency) are predictive of the individual metacognitive biases. We conclude by suggesting that metacognitive biases originate from fundamentally biased learning computations.
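To make the computational claim concrete, the following is a minimal Python sketch of this class of model, not the paper's fitted implementation. All names and numerical values (ALPHA_CONF, ALPHA_DISC, OMEGA, BETA, the context-update rate of 0.1) are illustrative assumptions. The sketch combines the three ingredients named in the abstract: outcomes centered on a learned context value (context-dependent learning), a larger learning rate for prediction errors that confirm the choice (confirmatory updating), and a confidence readout that overweights the chosen option's learned value (OMEGA > 1).

```python
import numpy as np

# Hypothetical parameter names and values, for illustration only.
ALPHA_CONF = 0.35   # learning rate for confirmatory (positive) prediction errors
ALPHA_DISC = 0.10   # learning rate for disconfirmatory (negative) prediction errors
OMEGA      = 1.5    # overweighting of the chosen option's value in confidence
BETA       = 5.0    # softmax inverse temperature

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate_context(outcomes_a, outcomes_b, rng):
    """Simulate one learning context (a fixed pair of options A and B).

    outcomes_a / outcomes_b: per-trial feedback for each option.
    Returns per-trial choices (0 = A, 1 = B) and confidence judgments.
    """
    q = np.zeros(2)    # learned option values within this context
    v_context = 0.0    # running estimate of the context's average value
    choices, confidence = [], []
    for r_a, r_b in zip(outcomes_a, outcomes_b):
        # Softmax choice between the two options.
        p_a = sigmoid(BETA * (q[0] - q[1]))
        c = 0 if rng.random() < p_a else 1
        r = (r_a, r_b)[c]

        # Context-dependent encoding: the obtained outcome is centered
        # on the context value, so gain and loss contexts are learned
        # on a common relative scale.
        v_context += 0.1 * (r - v_context)
        delta = (r - v_context) - q[c]

        # Confirmatory updating: prediction errors that confirm the
        # choice (positive delta for the chosen option) are weighted
        # more heavily than disconfirmatory ones.
        alpha = ALPHA_CONF if delta > 0 else ALPHA_DISC
        q[c] += alpha * delta

        # Confidence readout: the chosen option's value is multiplied
        # by OMEGA > 1, so confidence grows faster than objective
        # choice accuracy as values are learned.
        confidence.append(sigmoid(BETA * (OMEGA * q[c] - q[1 - c])))
        choices.append(c)
    return choices, confidence

# Example: a gain context (outcomes 1 or 0) and a loss context (0 or -1),
# with a 75%-correct option paired against a 25%-correct option.
rng = np.random.default_rng(0)
gain_choices, gain_conf = simulate_context(
    rng.binomial(1, 0.75, 100), rng.binomial(1, 0.25, 100), rng)
loss_choices, loss_conf = simulate_context(
    rng.binomial(1, 0.75, 100) - 1, rng.binomial(1, 0.25, 100) - 1, rng)
```

In this sketch the overweighting factor is what turns well-calibrated value learning into overconfidence: because the chosen option's value enters the confidence readout inflated by OMEGA, simulated confidence exceeds the softmax choice probability whenever the learned chosen value is positive, mirroring the dissociation between choices and confidence described in the abstract.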