Estimation-uncertainty affects decisions with and without learning opportunities.

Author information

Aberg Kristoffer C, Antle Levi, Paz Rony

Affiliations

Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel.

Azrieli Institute for Brain and Neural Sciences, Weizmann Institute of Science, Rehovot, Israel.

Publication information

Nat Commun. 2025 Jul 21;16(1):6706. doi: 10.1038/s41467-025-61960-2.

DOI:10.1038/s41467-025-61960-2

PMID:40691426

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12280070/

Abstract

Motivated behavior during reinforcement learning is determined by outcome expectations and their estimation-uncertainty (how frequently an option has been sampled), with the latter modulating exploration rates. However, although differences in sampling-rates are inherent to most types of reinforcement learning paradigms that confront highly rewarded options with less rewarded ones, it is unclear whether and how estimation-uncertainty lingers to affect long-term decisions without opportunities to learn or to explore. Here, we show that sampling-rates acquired during a reinforcement learning phase (with feedback) correlate with decision biases in a subsequent test phase (without feedback), independently from outcome expectations. Further, computational model-fits to behavior are improved by estimation-uncertainty, and specifically so for options with smaller sampling-rates/larger estimation-uncertainties. These results are replicated in two additional independent datasets. Our findings highlight that estimation-uncertainty is an important factor to consider when trying to understand human decision making.
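
To make the two quantities in the abstract concrete, the following is a minimal illustrative sketch, not the authors' model. It assumes a standard delta-rule learner for outcome expectations, a hypothetical uncertainty proxy `1 / sqrt(n + 1)` that shrinks with the number of times an option was sampled, and a softmax choice rule for the no-feedback test phase. The function names and the parameters `alpha`, `beta`, and `phi` are all assumptions introduced for illustration; the abstract does not specify any of them.

```python
import math

def update(q, n, option, reward, alpha=0.3):
    """Learning phase (with feedback): delta-rule update of the expected
    outcome for `option`, plus a count of how often it was sampled."""
    q[option] += alpha * (reward - q[option])
    n[option] += 1

def uncertainty(n_samples):
    """Hypothetical estimation-uncertainty proxy (an assumption, not taken
    from the paper): larger for options that were sampled less often."""
    return 1.0 / math.sqrt(n_samples + 1)

def p_choose_a(q, n, a, b, beta=3.0, phi=0.0):
    """Test phase (no feedback): softmax probability of choosing `a` over `b`.
    `phi` weights estimation-uncertainty; its sign is left as a free
    parameter (negative = uncertainty-averse, positive = uncertainty-seeking)."""
    v_a = q[a] + phi * uncertainty(n[a])
    v_b = q[b] + phi * uncertainty(n[b])
    return 1.0 / (1.0 + math.exp(-beta * (v_a - v_b)))

# Two options with equal outcome expectations but unequal sampling rates.
q = {"A": 0.5, "B": 0.5}
n = {"A": 20, "B": 2}
print(p_choose_a(q, n, "A", "B", phi=0.0))   # expectations only
print(p_choose_a(q, n, "A", "B", phi=-1.0))  # uncertainty-averse chooser
```

With `phi = 0` the two options are chosen equally often because their expectations match. Any nonzero `phi` produces a choice bias driven only by the difference in sampling counts, which is the kind of expectation-independent effect the abstract describes.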
