Swiss Center for Affective Science, University of Geneva.
Département d'Études Cognitives, École Normale Supérieure, PSL Research University.
Psychol Rev. 2023 Jul;130(4):1017-1043. doi: 10.1037/rev0000424. Epub 2023 May 8.
We systematically misjudge our own performance in simple economic tasks. First, we generally overestimate our ability to make correct choices, a bias called overconfidence. Second, we are more confident in our choices when we seek gains than when we try to avoid losses, a bias we refer to as the valence-induced confidence bias. Strikingly, these two biases are also present in reinforcement-learning (RL) contexts, despite the fact that outcomes are provided trial by trial and could, in principle, be used to recalibrate confidence judgments online. How confidence biases emerge and are maintained in reinforcement-learning contexts is thus puzzling and still unaccounted for. To explain this paradox, we propose that confidence biases stem from learning biases, and we test this hypothesis using data from multiple experiments in which we concomitantly assessed instrumental choices and confidence judgments during learning and transfer phases. Our results first show that participants' choices in both tasks are best accounted for by a reinforcement-learning model featuring context-dependent learning and confirmatory updating. We then demonstrate that the complex, biased pattern of confidence judgments elicited during both tasks can be explained by an overweighting of the learned value of the chosen option in the computation of confidence judgments. We finally show that, consequently, the individual learning-model parameters responsible for the learning biases (confirmatory updating and outcome context-dependency) are predictive of the individual metacognitive biases. We conclude by suggesting that metacognitive biases originate from fundamentally biased learning computations.
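To make the computational claim concrete, the following is a minimal Python sketch of this class of model, not the paper's fitted implementation. All names and numerical values (ALPHA_CONF, ALPHA_DISC, OMEGA, BETA, the context-update rate of 0.1) are illustrative assumptions. The sketch combines the three ingredients named in the abstract: outcomes centered on a learned context value (context-dependent learning), a larger learning rate for prediction errors that confirm the choice (confirmatory updating), and a confidence readout that overweights the chosen option's learned value (OMEGA > 1).

```python
import numpy as np

# Hypothetical parameter names and values, for illustration only.
ALPHA_CONF = 0.35   # learning rate for confirmatory (positive) prediction errors
ALPHA_DISC = 0.10   # learning rate for disconfirmatory (negative) prediction errors
OMEGA      = 1.5    # overweighting of the chosen option's value in confidence
BETA       = 5.0    # softmax inverse temperature

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate_context(outcomes_a, outcomes_b, rng):
    """Simulate one learning context (a fixed pair of options A and B).

    outcomes_a / outcomes_b: per-trial feedback for each option.
    Returns per-trial choices (0 = A, 1 = B) and confidence judgments.
    """
    q = np.zeros(2)    # learned option values within this context
    v_context = 0.0    # running estimate of the context's average value
    choices, confidence = [], []
    for r_a, r_b in zip(outcomes_a, outcomes_b):
        # Softmax choice between the two options.
        p_a = sigmoid(BETA * (q[0] - q[1]))
        c = 0 if rng.random() < p_a else 1
        r = (r_a, r_b)[c]

        # Context-dependent encoding: the obtained outcome is centered
        # on the context value, so gain and loss contexts are learned
        # on a common relative scale.
        v_context += 0.1 * (r - v_context)
        delta = (r - v_context) - q[c]

        # Confirmatory updating: prediction errors that confirm the
        # choice (positive delta for the chosen option) are weighted
        # more heavily than disconfirmatory ones.
        alpha = ALPHA_CONF if delta > 0 else ALPHA_DISC
        q[c] += alpha * delta

        # Confidence readout: the chosen option's value is multiplied
        # by OMEGA > 1, so confidence grows faster than objective
        # choice accuracy as values are learned.
        confidence.append(sigmoid(BETA * (OMEGA * q[c] - q[1 - c])))
        choices.append(c)
    return choices, confidence

# Example: a gain context (outcomes 1 or 0) and a loss context (0 or -1),
# with a 75%-correct option paired against a 25%-correct option.
rng = np.random.default_rng(0)
gain_choices, gain_conf = simulate_context(
    rng.binomial(1, 0.75, 100), rng.binomial(1, 0.25, 100), rng)
loss_choices, loss_conf = simulate_context(
    rng.binomial(1, 0.75, 100) - 1, rng.binomial(1, 0.25, 100) - 1, rng)
```

In this sketch the overweighting factor is what turns well-calibrated value learning into overconfidence: because the chosen option's value enters the confidence readout inflated by OMEGA, simulated confidence exceeds the softmax choice probability whenever the learned chosen value is positive, mirroring the dissociation between choices and confidence described in the abstract.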