Uncertainty and reward histories have distinct effects on decisions after wins and losses.

Authors

Kalhan Shivam, Magnard Robin, Cheng Yifeng, Janak Patricia H

Affiliations

Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD.

Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD.

Publication information

bioRxiv. 2025 Aug 19:2025.08.14.670176. doi: 10.1101/2025.08.14.670176.

Abstract

Intelligent behavior necessitates an adaptive integration of feedback. It is well known that animals learn asymmetrically from positive and negative feedback. While asymmetrical learning is a robust behavioral effect, the latent computations behind how animals represent their environments and use these representations to differentially weight wins and losses are poorly understood. Here we tested whether and how uncertainty and reward histories modulate the weights placed on wins and losses, using a behavioral data set collected in rats. We propose a reinforcement learning model that integrates uncertainty history via an unsigned average reward prediction error and a separate subjective reward history component. We show that, in a dynamic probabilistic reversal learning task with blocks of variable reward predictability, ongoing estimates of uncertainty history and reward history each distinctly influenced rats' sensitivity to wins and losses. In more predictable environments and under low uncertainty, i.e., when rats were certain in making 'correct' choices, rats weighted wins more than losses, as indicated by a higher win-stay probability and a lower lose-shift probability. This asymmetrical learning strategy enabled rats to remain with the correct action while discounting the influence of rare losses. Further, male rats were more affected by their reward history, i.e., environmental richness, when making lose-shift decisions, whereas female rats were more influenced by their uncertainty history. Hence, we found sex-specific contributions of these latent computations in modulating behavior. Overall, we demonstrate that asymmetrically weighting wins and losses could form an important behavioral strategy when adapting to ongoing changes in reward and uncertainty history.
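
The abstract reports win-stay and lose-shift probabilities as its behavioral readout of how wins and losses are weighted. As a minimal sketch of how such probabilities are conventionally computed from a trial sequence, the Python below may help; it is not the authors' analysis code. The function name `win_stay_lose_shift`, the integer coding of choices, and the 0/1 coding of rewards are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def win_stay_lose_shift(
    choices: Sequence[int], rewards: Sequence[int]
) -> Tuple[float, float]:
    """Return (P(stay | win), P(shift | loss)) for one trial sequence.

    Assumed coding (hypothetical, not taken from the paper):
    choices[t] is the option chosen on trial t, rewards[t] is 1 for a
    rewarded trial (win) and 0 for an unrewarded trial (loss).
    """
    if len(choices) != len(rewards):
        raise ValueError("choices and rewards must have the same length")

    wins = stays_after_win = 0
    losses = shifts_after_loss = 0

    # The last trial has no following choice, so it is excluded.
    for t in range(len(choices) - 1):
        stayed = choices[t + 1] == choices[t]
        if rewards[t]:
            wins += 1
            stays_after_win += int(stayed)
        else:
            losses += 1
            shifts_after_loss += int(not stayed)

    # NaN marks an undefined probability (no wins or no losses observed).
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift


example_choices = [0, 0, 1, 1, 0]
example_rewards = [1, 0, 1, 1, 0]
ws, ls = win_stay_lose_shift(example_choices, example_rewards)
print(f"win-stay = {ws:.2f}, lose-shift = {ls:.2f}")
```

Under this coding, a higher win-stay probability paired with a lower lose-shift probability corresponds to the pattern the abstract describes for predictable, low-uncertainty blocks.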

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ad1/12393270/65a32ec06e7a/nihpp-2025.08.14.670176v1-f0001.jpg
