

Performance-gated deliberation: A context-adapted strategy in which urgency is opportunity cost.

Affiliations

Mila, Québec AI Institute, Montréal, Canada.

Department of Computer Science & Operations Research, Université de Montréal, Montréal, Canada.

Publication information

PLoS Comput Biol. 2022 May 26;18(5):e1010080. doi: 10.1371/journal.pcbi.1010080. eCollection 2022 May.

Abstract

Finding the right amount of deliberation, between insufficient and excessive, is a hard decision-making problem that depends on the value we place on our time. Average reward, putatively encoded by tonic dopamine, serves in existing reinforcement learning theory as the opportunity cost of time, including deliberation time. Importantly, this cost can itself vary with the environmental context and is not trivial to estimate. Here, we propose how the opportunity cost of deliberation can be estimated adaptively on multiple timescales to account for non-stationary contextual factors. We use it in a simple decision-making heuristic based on average-reward reinforcement learning (AR-RL) that we call Performance-Gated Deliberation (PGD). We propose PGD as a strategy used by animals wherein deliberation cost is implemented directly as urgency, a previously characterized neural signal that effectively controls the speed of the decision-making process. We show that PGD outperforms AR-RL solutions in explaining the behaviour and urgency of non-human primates in a context-varying random walk prediction task, and that it is consistent with relative performance and urgency in a context-varying random dot motion task. We make readily testable predictions for both neural activity and behaviour.
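The two ideas in the abstract, estimating the opportunity cost of time on multiple timescales and using it as an urgency signal that gates commitment, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `RewardRateTracker`, the function `commit_step`, the time constants, the averaging of the estimates, and the linearly falling threshold `max_reward - cost_rate * elapsed` are all assumptions chosen for illustration.

```python
class RewardRateTracker:
    """Hypothetical multi-timescale estimate of reward per unit time."""

    def __init__(self, timescales=(2.0, 20.0)):
        # One exponential filter per time constant (values are assumptions).
        self.timescales = tuple(timescales)
        self.estimates = [0.0] * len(self.timescales)

    def update(self, reward, duration):
        """Fold one trial (reward earned over `duration`) into each filter."""
        rate = reward / duration
        for i, tau in enumerate(self.timescales):
            # Short timescales adapt quickly, long ones slowly.
            alpha = min(1.0, duration / tau)
            self.estimates[i] += alpha * (rate - self.estimates[i])

    def cost_rate(self):
        """Assumed combination rule: plain average of the filter outputs."""
        return sum(self.estimates) / len(self.estimates)


def commit_step(expected_rewards, cost_rate, dt=1.0, max_reward=1.0):
    """Return the first step at which the expected reward of committing
    reaches a threshold that falls with the accumulated cost of time.

    Assumed rule: commit when expected >= max_reward - cost_rate * elapsed.
    """
    for step, expected in enumerate(expected_rewards):
        elapsed = step * dt
        if expected >= max_reward - cost_rate * elapsed:
            return step
    # Forced commitment at the end of the trial if the rule never fires.
    return len(expected_rewards) - 1


tracker = RewardRateTracker()
tracker.update(reward=1.0, duration=1.0)

# Expected reward for committing, growing as evidence accumulates.
belief = [0.5, 0.6, 0.7, 0.8, 0.9]
slow_context = commit_step(belief, cost_rate=0.1)
fast_context = commit_step(belief, cost_rate=0.2)
```

In this sketch, a higher cost rate lowers the threshold faster, so the same evidence stream triggers an earlier commitment, which is the qualitative sense in which urgency acts as an opportunity cost.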


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1425/9176815/568d5b22701f/pcbi.1010080.g001.jpg
