Correction to Biderman et al. (2023).

Publication information

J Exp Psychol Gen. 2023 Sep;152(9):2437. doi: 10.1037/xge0001478.

Abstract

Reports an error in "The role of memory in counterfactual valuation" by Natalie Biderman, Samuel J. Gershman and Daphna Shohamy (Journal of Experimental Psychology: General, 2023[Jun], Vol 152[6], 1754-1767). Several corrections have been made to two equations, the text, and Figure 3 of this article. First, there was an error in two equations of the policy gradient model described in the Model Description section. The correction did not alter the main conclusion of the model, but it did slightly change the comparison between experimental conditions of each model reported in the Results section. The correct Equation 3 and correct Equation 4 are present in the erratum. Additionally, the last four sentences in the first paragraph of the section "A Policy Gradient Model Captures the Memory-Based Inverse Decision Bias" have been revised. Finally, the model predictions shown in the gray bars of Figure 3 were slightly modified. The online version of this article has been corrected. (The following abstract of the original article appeared in record 2023-72914-001.)

Value-based decisions are often guided by past experience. If a choice led to a good outcome, we are more likely to repeat it. This basic idea is well-captured by reinforcement-learning models. However, open questions remain about how we assign value to options we did not choose and which we therefore never had the chance to learn about directly. One solution to this problem is proposed by policy gradient reinforcement-learning models; these do not require direct learning of value, instead optimizing choices according to a behavioral policy. For example, a logistic policy predicts that if a chosen option was rewarded, the unchosen option would be deemed less desirable. Here, we test the relevance of these models to human behavior and explore the role of memory in this phenomenon. We hypothesize that a policy may emerge from an associative memory trace formed during deliberation between choice options. In a preregistered study (N = 315) we show that people tend to invert the value of unchosen options relative to the outcome of chosen options, a phenomenon we term inverse decision bias. The inverse decision bias is correlated with memory for the association between choice options; moreover, it is reduced when memory formation is experimentally interfered with. Finally, we present a new memory-based policy gradient model that predicts both the inverse decision bias and its dependence on memory. Our findings point to a significant role of associative memory in valuation of unchosen options and introduce a new perspective on the interaction between decision-making, memory, and counterfactual reasoning. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
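
The logistic-policy prediction mentioned in the abstract can be illustrated with a minimal sketch. This is a generic two-option REINFORCE update, not the memory-based model of the article and not its corrected Equations 3 and 4, which are not reproduced in this record. The function names, the learning rate, and the reward values are assumptions chosen purely for illustration.

```python
import math


def prob_choose_a(theta_a, theta_b):
    """Probability of choosing option A under a two-option logistic policy."""
    return 1.0 / (1.0 + math.exp(-(theta_a - theta_b)))


def reinforce_update(theta, chosen, reward, lr=0.5):
    """One REINFORCE step on the policy preferences (hypothetical helper).

    For a softmax/logistic policy, d log pi(chosen) / d theta_i equals
    1[i == chosen] - pi(i), so the unchosen option's preference moves
    in the opposite direction to the reward received for the chosen one.
    """
    p_a = prob_choose_a(theta["A"], theta["B"])
    probs = {"A": p_a, "B": 1.0 - p_a}
    updated = {}
    for option, value in theta.items():
        indicator = 1.0 if option == chosen else 0.0
        updated[option] = value + lr * reward * (indicator - probs[option])
    return updated


# Start indifferent between the two options (assumed initial state).
theta = {"A": 0.0, "B": 0.0}

# A is chosen and rewarded: preference for the unchosen option B goes down.
after_reward = reinforce_update(theta, chosen="A", reward=1.0)

# A is chosen and punished: preference for the unchosen option B goes up.
after_loss = reinforce_update(theta, chosen="A", reward=-1.0)

print(after_reward)
print(after_loss)
```

In this sketch the preference for the unchosen option is never updated from its own outcome; it shifts only because the policy is defined over the pair of options, which is the sense in which such a policy inverts the value of the unchosen option.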
