Department of Applied Mathematics, University of Colorado Boulder, Boulder, United States.
Department of Neuroscience, University of Pennsylvania, Philadelphia, United States.
eLife. 2022 Oct 25;11:e79824. doi: 10.7554/eLife.79824.
Models based on normative principles have played a major role in our understanding of how the brain forms decisions. However, these models have typically been derived for simple, stable conditions, and their relevance to decisions formed under more naturalistic, dynamic conditions is unclear. We previously derived a normative decision model in which evidence accumulation is adapted to fluctuations in the evidence-generating process that occur during a single decision (Glaze et al., 2015), but the evolution of commitment rules (e.g. thresholds on the accumulated evidence) under dynamic conditions is not fully understood. Here, we derive a normative model for decisions based on changing contexts, which we define as changes in evidence quality or reward, over the course of a single decision. In these cases, performance (reward rate) is maximized using decision thresholds that respond to and even anticipate these changes, in contrast to the static thresholds used in many decision models. We show that these adaptive thresholds exhibit several distinct temporal motifs that depend on the specific predicted and experienced context changes and that adaptive models perform robustly even when implemented imperfectly (noisily). We further show that decision models with adaptive thresholds outperform those with constant or urgency-gated thresholds in accounting for human response times on a task with time-varying evidence quality and average reward. These results further link normative and neural decision-making while expanding our view of both as dynamic, adaptive processes that update and use expectations to govern both deliberation and commitment.
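The contrast between static and time-varying thresholds can be illustrated with a minimal simulation sketch. This is not the normative model derived in the paper, whose thresholds are obtained by maximizing reward rate. It is a plain drift-diffusion accumulator with a pluggable threshold function. All names (`simulate_decision`, `constant_threshold`, `collapsing_threshold`) and all parameter values are hypothetical, and the collapsing shape is an assumed stand-in for a threshold that anticipates a known context change at time `t_change`.

```python
import numpy as np


def simulate_decision(drift, threshold_fn, dt=0.01, t_max=5.0, rng=None):
    """Accumulate noisy evidence until |x| reaches threshold_fn(t).

    Returns (choice, response_time). choice is +1 or -1, or 0 if no
    threshold was reached before t_max.
    """
    rng = np.random.default_rng() if rng is None else rng
    x, t = 0.0, 0.0
    while t < t_max:
        # Euler-Maruyama step for dx = drift * dt + dW (unit noise)
        x += drift * dt + np.sqrt(dt) * rng.standard_normal()
        t += dt
        if abs(x) >= threshold_fn(t):
            return (1 if x > 0 else -1), t
    return 0, t_max


def constant_threshold(t, height=1.0):
    """Static threshold, as used in many decision models."""
    return height


def collapsing_threshold(t, height=1.0, t_change=1.0, width=0.2):
    """Assumed illustrative shape: flat, then a linear collapse to zero
    over the last `width` seconds before an anticipated change."""
    if t < t_change - width:
        return height
    return max(0.0, height * (t_change - t) / width)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for name, fn in [("constant", constant_threshold),
                     ("collapsing", collapsing_threshold)]:
        rts = [simulate_decision(0.5, fn, rng=rng)[1] for _ in range(200)]
        print(name, "mean response time:", np.mean(rts))
```

Under these assumptions, the collapsing threshold forces a commitment by roughly `t_change`, whereas the constant threshold allows deliberation to continue until the accumulated evidence reaches the bound or `t_max` elapses.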