Choice-confirmation bias and gradual perseveration in human reinforcement learning.

Author information

Palminteri Stefano

Affiliation

Laboratoire de Neurosciences Cognitives et Computationnelles, Departement d'Etudes Cognitives, Ecole Normale Superieure, Paris Sciences et Lettres Research University.

Publication information

Behav Neurosci. 2023 Feb;137(1):78-88. doi: 10.1037/bne0000541. Epub 2022 Nov 17.

DOI:10.1037/bne0000541

PMID:36395020

Abstract

Do we preferentially learn from outcomes that confirm our choices? In recent years, we investigated this question in a series of studies implementing increasingly complex behavioral protocols. The learning rates fitted in experiments featuring partial or complete feedback, as well as free and forced choices, were systematically found to be consistent with a choice-confirmation bias. One of the prominent behavioral consequences of the confirmatory learning rate pattern is choice hysteresis: that is, the tendency to repeat previous choices despite contradictory evidence. However, a choice-confirmatory pattern of learning rates may spuriously arise from omitting an explicit (gradual) choice perseveration term from the model. In the present study, we reanalyze data from four published papers (nine experiments; 363 subjects; 126,192 trials), originally included in the studies demonstrating or criticizing the choice-confirmation bias in human participants. We fitted two models: one featuring valence-specific updates (i.e., different learning rates for confirmatory and disconfirmatory outcomes) and one additionally including gradual perseveration. Our analysis confirms that the inclusion of the gradual perseveration process in the model significantly reduces the estimated choice-confirmation bias. However, in all considered experiments, the choice-confirmation bias remains present at the meta-analytical level, and significantly different from zero in most experiments. Our results demonstrate that the choice-confirmation bias resists the inclusion of a gradual perseveration term, thus proving to be a robust feature of human reinforcement learning. We conclude by pointing to additional computational processes that may play an important role in estimating and interpreting the computational biases under scrutiny. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
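
The two models named in the abstract can be illustrated with a minimal sketch. The abstract does not give equations, so everything specific here is an assumption made for illustration: the function names, the parameter names (`alpha_conf`, `alpha_disc`, `beta`, `phi`, `tau`), the softmax decision rule, the exponential choice-trace update, and the initial values of 0.5. It covers only partial feedback (the chosen outcome) in a two-armed bandit and should not be read as the paper's exact parameterization.

```python
import numpy as np

def softmax_policy(q, trace, beta, phi):
    """Choice probabilities from option values plus a perseveration bonus."""
    logits = beta * q + phi * trace
    logits = logits - logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def update(q, trace, choice, reward, alpha_conf, alpha_disc, tau):
    """One-trial update with partial feedback (chosen outcome only)."""
    q, trace = q.copy(), trace.copy()
    delta = reward - q[choice]  # prediction error for the chosen option
    # Assumption: a positive prediction error on the chosen option counts
    # as confirmatory, anything else as disconfirmatory.
    alpha = alpha_conf if delta > 0 else alpha_disc
    q[choice] += alpha * delta
    # Gradual perseveration (assumed form): the choice trace moves toward 1
    # for the chosen option and toward 0 for the other, at rate tau.
    target = np.zeros_like(trace)
    target[choice] = 1.0
    trace += tau * (target - trace)
    return q, trace

def simulate(n_trials=200, p_reward=(0.75, 0.25), alpha_conf=0.3,
             alpha_disc=0.1, beta=5.0, phi=1.0, tau=0.2, seed=0):
    """Simulate choices of a hypothetical agent in a two-armed bandit."""
    rng = np.random.default_rng(seed)
    q = np.full(2, 0.5)   # initial values (assumed)
    trace = np.zeros(2)   # no perseveration before the first choice
    choices = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        p = softmax_policy(q, trace, beta, phi)
        c = rng.choice(2, p=p)
        r = float(rng.random() < p_reward[c])
        q, trace = update(q, trace, c, r, alpha_conf, alpha_disc, tau)
        choices[t] = c
    return choices
```

Setting `phi=0` in this sketch reduces it to the model with valence-specific learning rates only. In the same notation, a choice-confirmation bias corresponds to `alpha_conf > alpha_disc`. Both a positive `phi` and that asymmetry make repeated choices more likely, which is why leaving out the perseveration term could inflate the estimated bias.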

Similar articles

Linking confidence biases to reinforcement-learning processes.

Psychol Rev. 2023 Jul;130(4):1017-1043. doi: 10.1037/rev0000424. Epub 2023 May 8.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing.

PLoS Comput Biol. 2017 Aug 11;13(8):e1005684. doi: 10.1371/journal.pcbi.1005684. eCollection 2017 Aug.

Confirmatory reinforcement learning changes with age during adolescence.

Dev Sci. 2023 May;26(3):e13330. doi: 10.1111/desc.13330. Epub 2022 Oct 27.

Robust valence-induced biases on motor response and confidence in human reinforcement learning.

Cogn Affect Behav Neurosci. 2020 Dec;20(6):1184-1199. doi: 10.3758/s13415-020-00826-0.

The computational roots of positivity and confirmation biases in reinforcement learning.

Trends Cogn Sci. 2022 Jul;26(7):607-621. doi: 10.1016/j.tics.2022.04.005. Epub 2022 May 31.

Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts.

PLoS Comput Biol. 2024 Mar 29;20(3):e1011950. doi: 10.1371/journal.pcbi.1011950. eCollection 2024 Mar.

Computational analysis of probabilistic reversal learning deficits in male subjects with alcohol use disorder.

Front Psychiatry. 2022 Oct 19;13:960238. doi: 10.3389/fpsyt.2022.960238. eCollection 2022.

A Normative Account of Confirmation Bias During Reinforcement Learning.

Neural Comput. 2022 Jan 14;34(2):307-337. doi: 10.1162/neco_a_01455.

Cited by

How working memory and reinforcement learning interact when avoiding punishment and pursuing reward concurrently.

J Exp Psychol Gen. 2025 Sep 1. doi: 10.1037/xge0001817.

Data-driven equation discovery reveals nonlinear reinforcement learning in humans.

Proc Natl Acad Sci U S A. 2025 Aug 5;122(31):e2413441122. doi: 10.1073/pnas.2413441122. Epub 2025 Jul 31.

Humans forage for reward in reinforcement learning tasks.

bioRxiv. 2025 Mar 7:2024.07.08.602539. doi: 10.1101/2024.07.08.602539.

Uncertainty of treatment efficacy moderates placebo effects on reinforcement learning.

Sci Rep. 2024 Jun 22;14(1):14421. doi: 10.1038/s41598-024-64240-z.

Prediction-error-dependent processing of immediate and delayed positive feedback.

Sci Rep. 2024 Apr 27;14(1):9674. doi: 10.1038/s41598-024-60328-8.

Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts.

PLoS Comput Biol. 2024 Mar 29;20(3):e1011950. doi: 10.1371/journal.pcbi.1011950. eCollection 2024 Mar.

Uncovering the potential of evaluative conditioning in shaping attitudes toward sustainable product packaging.

Front Psychol. 2024 Mar 14;15:1284422. doi: 10.3389/fpsyg.2024.1284422. eCollection 2024.

Individuals with problem gambling and obsessive-compulsive disorder learn through distinct reinforcement mechanisms.

PLoS Biol. 2023 Mar 14;21(3):e3002031. doi: 10.1371/journal.pbio.3002031. eCollection 2023 Mar.

Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T.

Hum Brain Mapp. 2022 Oct 15;43(15):4750-4790. doi: 10.1002/hbm.25988. Epub 2022 Jul 21.
