Aarhus University, Department of Chemistry, Aarhus, Denmark.
Aarhus University, Danish Research Institute of Translational Neuroscience (DANDRITE), Aarhus, Denmark.
PLoS Comput Biol. 2021 Oct 4;17(10):e1009452. doi: 10.1371/journal.pcbi.1009452. eCollection 2021 Oct.
Choice history effects describe how future choices depend on the history of past choices. In experimental tasks this is typically framed as a bias because it often diminishes the experienced reward rates. However, in natural habitats, choices made in the past constrain choices that can be made in the future. For foraging animals, the probability of earning a reward in a given patch depends on the degree to which the animals have exploited the patch in the past. One problem with many experimental tasks that show choice history effects is that such tasks artificially decouple choice history from its consequences on reward availability over time. To circumvent this, we use a variable interval (VI) reward schedule that reinstates a more natural contingency between past choices and future reward availability. By examining the behavior of optimal agents in the VI task, we discover that the choice history effects observed in animals serve to maximize reward harvesting efficiency. We further distil the function of choice history effects by manipulating the first- and second-order statistics of the environment. We find that choice history effects primarily reflect the growth rate of the reward probability of the unchosen option, whereas reward history effects primarily reflect environmental volatility. Based on the choice history effects observed in animals, we develop a reinforcement learning model that explicitly incorporates choice history over multiple time scales into the decision process, and we assess its adequacy in predicting the associated behavior. We show that this new variant, termed the double trace model, performs better in predicting choice data and exhibits near-optimal reward harvesting efficiency in simulated environments. These results suggest that choice history effects may be adaptive for natural contingencies between consumption and reward availability.
This concept lends credence to a normative account of choice history effects that extends beyond their description as a bias.
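The coupling between past choices and future reward availability that the VI schedule restores can be illustrated with a small simulation. The following is a minimal sketch of a discrete-trial baited VI task, not the paper's actual implementation: the bait probabilities, the random choice policy, and the function names are illustrative assumptions. Once an option is baited, the reward persists until that option is next chosen, so the reward probability of an unchosen option grows over time.

```python
import random

def p_reward_if_unchosen(p_bait, t):
    """Reward probability of an option left unchosen for t trials.

    With per-trial bait probability p_bait and a bait that persists once
    set, the probability grows as 1 - (1 - p_bait)**t. This growth rate
    of the unchosen option's reward probability is the quantity that
    choice history effects are said to track.
    """
    return 1.0 - (1.0 - p_bait) ** t

def simulate_vi_baiting(n_trials=10_000, p_bait=(0.15, 0.05), seed=0):
    """Simulate a discrete-trial baited variable-interval (VI) task.

    Each trial, every unbaited option is independently baited with its
    own probability; a bait persists until that option is next chosen,
    so past choices shape future reward availability. A random policy
    is used purely for illustration. Returns the obtained reward rate.
    """
    rng = random.Random(seed)
    baited = [False, False]
    rewards = 0
    for _ in range(n_trials):
        # Baiting step: unbaited options may become baited.
        for i, p in enumerate(p_bait):
            if not baited[i] and rng.random() < p:
                baited[i] = True
        # Choice step: collect (and clear) any bait on the chosen side.
        choice = rng.randrange(2)
        if baited[choice]:
            rewards += 1
            baited[choice] = False
    return rewards / n_trials
```

Because a bait persists until collected, rewards accumulate on the neglected side, so occasional switching to the less valuable option pays off; this is the natural contingency between past choices and future reward availability that static reward schedules remove.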