The computational roots of positivity and confirmation biases in reinforcement learning.

Affiliations

Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et Recherche Médicale, Paris, France; Département d'Études Cognitives, Ecole Normale Supérieure, Paris, France; Université de Recherche Paris Sciences et Lettres, Paris, France.

Paris School of Economics, Paris, France; LabNIC, Department of Fundamental Neurosciences, University of Geneva, Geneva, Switzerland; Swiss Center for Affective Science, Geneva, Switzerland.

Publication information

Trends Cogn Sci. 2022 Jul;26(7):607-621. doi: 10.1016/j.tics.2022.04.005. Epub 2022 May 31.

DOI:10.1016/j.tics.2022.04.005

PMID:35662490

Abstract

Humans do not integrate new information objectively: outcomes carrying a positive affective value and evidence confirming one's own prior belief are overweighed. Until recently, theoretical and empirical accounts of the positivity and confirmation biases assumed them to be specific to 'high-level' belief updates. We present evidence against this account. Learning rates in reinforcement learning (RL) tasks, estimated across different contexts and species, generally present the same characteristic asymmetry, suggesting that belief and value updating processes share key computational principles and distortions. This bias generates over-optimistic expectations about the probability of making the right choices and, consequently, generates over-optimistic reward expectations. We discuss the normative and neurobiological roots of these RL biases and their position within the greater picture of behavioral decision-making theories.
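The learning-rate asymmetry described in the abstract can be illustrated with a minimal sketch. It assumes a simple Rescorla-Wagner-style value update with two learning rates, one applied to positive prediction errors and one to negative ones. The function names, parameter values, and the single-option task are illustrative assumptions and are not taken from the paper.

```python
import random


def update_value(q, reward, alpha_pos, alpha_neg):
    """One value update with separate learning rates (illustrative assumption)."""
    delta = reward - q  # prediction error
    # Positive prediction errors use alpha_pos, all others use alpha_neg.
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta


def simulate(p_reward, alpha_pos, alpha_neg, n_trials=20000, seed=0):
    """Average value estimate over the second half of a run.

    The option pays 1 with probability p_reward and 0 otherwise.
    """
    rng = random.Random(seed)
    q = 0.5  # hypothetical neutral starting estimate
    tail = []
    for t in range(n_trials):
        reward = 1.0 if rng.random() < p_reward else 0.0
        q = update_value(q, reward, alpha_pos, alpha_neg)
        if t >= n_trials // 2:
            tail.append(q)
    return sum(tail) / len(tail)


# Same true reward probability, different learning-rate asymmetry.
unbiased = simulate(p_reward=0.5, alpha_pos=0.2, alpha_neg=0.2)
optimistic = simulate(p_reward=0.5, alpha_pos=0.3, alpha_neg=0.1)

print(f"symmetric rates:  {unbiased:.3f}")
print(f"asymmetric rates: {optimistic:.3f}")
```

Under these assumptions the average estimate settles near p·α⁺ / (p·α⁺ + (1 − p)·α⁻), where p is the reward probability and α⁺ and α⁻ are the two learning rates. With α⁺ > α⁻ the learned value therefore sits above the true reward probability, which is one way to read the over-optimistic reward expectations the abstract mentions. The sketch covers only the positivity bias for a single option. It does not model the choice-dependent (confirmation) form of the asymmetry.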

Similar articles

The computational roots of positivity and confirmation biases in reinforcement learning.

Trends Cogn Sci. 2022 Jul;26(7):607-621. doi: 10.1016/j.tics.2022.04.005. Epub 2022 May 31.

A Normative Account of Confirmation Bias During Reinforcement Learning.

Neural Comput. 2022 Jan 14;34(2):307-337. doi: 10.1162/neco_a_01455.

The shadowing effect of initial expectation on learning asymmetry.

PLoS Comput Biol. 2023 Jul 24;19(7):e1010751. doi: 10.1371/journal.pcbi.1010751. eCollection 2023 Jul.

Asymmetric and adaptive reward coding via normalized reinforcement learning.

PLoS Comput Biol. 2022 Jul 21;18(7):e1010350. doi: 10.1371/journal.pcbi.1010350. eCollection 2022 Jul.

Linking confidence biases to reinforcement-learning processes.

Psychol Rev. 2023 Jul;130(4):1017-1043. doi: 10.1037/rev0000424. Epub 2023 May 8.

Effort Reinforces Learning.

J Neurosci. 2022 Oct 5;42(40):7648-7658. doi: 10.1523/JNEUROSCI.2223-21.2022. Epub 2022 Sep 12.

Moderate confirmation bias enhances decision-making in groups of reinforcement-learning agents.

PLoS Comput Biol. 2024 Sep 4;20(9):e1012404. doi: 10.1371/journal.pcbi.1012404. eCollection 2024 Sep.

Multiple memory systems as substrates for multiple decision systems.

Neurobiol Learn Mem. 2015 Jan;117:4-13. doi: 10.1016/j.nlm.2014.04.014. Epub 2014 May 15.

Confirmatory reinforcement learning changes with age during adolescence.

Dev Sci. 2023 May;26(3):e13330. doi: 10.1111/desc.13330. Epub 2022 Oct 27.

Neural basis of reinforcement learning and decision making.

Annu Rev Neurosci. 2012;35:287-308. doi: 10.1146/annurev-neuro-062111-150512. Epub 2012 Mar 29.

Cited by

Acute isolation is associated with increased reward seeking and reward learning in human adolescents.

Commun Psychol. 2025 Sep 5;3(1):135. doi: 10.1038/s44271-025-00306-6.

Brain-wide representations of prior information in mouse decision-making.

Nature. 2025 Sep;645(8079):192-200. doi: 10.1038/s41586-025-09226-1. Epub 2025 Sep 3.

Uncertainty and reward histories have distinct effects on decisions after wins and losses.

bioRxiv. 2025 Aug 19:2025.08.14.670176. doi: 10.1101/2025.08.14.670176.

How working memory and reinforcement learning interact when avoiding punishment and pursuing reward concurrently.

J Exp Psychol Gen. 2025 Sep 1. doi: 10.1037/xge0001817.

Behavioral and computational signatures of reinforcement learning and confidence biases in gambling disorder.

J Behav Addict. 2025 Jun 5;14(2):982-996. doi: 10.1556/2006.2025.00046. Print 2025 Jul 2.

Understanding learning through uncertainty and bias.

Commun Psychol. 2025 Feb 13;3(1):24. doi: 10.1038/s44271-025-00203-y.

Altered trial-to-trial responses to reward outcomes in KCNMA1 knockout mice during probabilistic learning tasks.

Behav Brain Funct. 2024 Dec 28;20(1):36. doi: 10.1186/s12993-024-00262-x.

Contributions of Attention to Learning in Multidimensional Reward Environments.

J Neurosci. 2025 Feb 12;45(7):e2300232024. doi: 10.1523/JNEUROSCI.2300-23.2024.

Moderate confirmation bias enhances decision-making in groups of reinforcement-learning agents.

PLoS Comput Biol. 2024 Sep 4;20(9):e1012404. doi: 10.1371/journal.pcbi.1012404. eCollection 2024 Sep.

Influence of surprise on reinforcement learning in younger and older adults.

PLoS Comput Biol. 2024 Aug 14;20(8):e1012331. doi: 10.1371/journal.pcbi.1012331. eCollection 2024 Aug.
