Effects of blocked versus interleaved training on relative value learning.

Authors

Hayes William M, Wedell Douglas H

Affiliation

Department of Psychology, University of South Carolina, 1512 Pendleton St, Columbia, SC, 29208, USA.

Publication information

Psychon Bull Rev. 2023 Oct;30(5):1895-1907. doi: 10.3758/s13423-023-02290-6. Epub 2023 Apr 18.

DOI:10.3758/s13423-023-02290-6

PMID:37072667

Abstract

In reinforcement learning tasks, people learn the values of options relative to other options in the local context. Prior research suggests that relative value learning is enhanced when choice contexts are temporally clustered in a blocked sequence compared to a randomly interleaved sequence. The present study was aimed at further investigating the effects of blocked versus interleaved training using a choice task that distinguishes among different contextual encoding models. Our results showed that the presentation format in which contexts are experienced can lead to qualitatively distinct forms of relative value learning. This conclusion was supported by a combination of model-free and model-based analyses. In the blocked condition, choice behavior was most consistent with a reference point model in which outcomes are encoded relative to a dynamic estimate of the contextual average reward. In contrast, the interleaved condition was best described by a range-frequency encoding model. We propose that blocked training makes it easier to track contextual outcome statistics, such as the average reward, which may then be used to relativize the values of experienced outcomes. When contexts are interleaved, range-frequency encoding may serve as a more efficient means of storing option values in memory for later retrieval.
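
The abstract names two encoding schemes without giving their equations. The following is a minimal sketch of what each could look like, not the authors' fitted models. It assumes a simple delta-rule update for the running contextual average and the textbook form of range-frequency theory (a weighted mix of an outcome's position within the observed range and its rank among observed outcomes). The function names and the parameters `learning_rate`, `initial_reference`, and `range_weight` are hypothetical.

```python
from typing import List, Sequence


def reference_point_values(
    outcomes: Sequence[float],
    learning_rate: float = 0.1,      # assumed update rate for the running average
    initial_reference: float = 0.0,  # assumed starting estimate
) -> List[float]:
    """Encode each outcome relative to a dynamic estimate of the context's average reward."""
    reference = initial_reference
    encoded = []
    for outcome in outcomes:
        # The subjective value is the deviation from the current reference point.
        encoded.append(outcome - reference)
        # Delta-rule update moves the reference toward the latest outcome.
        reference += learning_rate * (outcome - reference)
    return encoded


def range_frequency_value(
    outcome: float,
    context_outcomes: Sequence[float],
    range_weight: float = 0.5,  # assumed weighting between range and frequency terms
) -> float:
    """Encode an outcome by its range position and rank within its context.

    Assumes `outcome` is one of the values in `context_outcomes`.
    """
    lowest, highest = min(context_outcomes), max(context_outcomes)
    if highest == lowest:
        return 0.5  # no spread in the context, so every outcome sits at the midpoint
    # Range term: where the outcome falls between the context minimum and maximum.
    range_value = (outcome - lowest) / (highest - lowest)
    # Frequency term: normalized rank, using the midrank when values are tied.
    below = sum(1 for x in context_outcomes if x < outcome)
    equal = sum(1 for x in context_outcomes if x == outcome)
    frequency_value = (below + 0.5 * (equal - 1)) / (len(context_outcomes) - 1)
    return range_weight * range_value + (1 - range_weight) * frequency_value
```

In this sketch the reference point scheme yields values that drift as the running average adapts, whereas the range-frequency scheme yields a bounded value between 0 and 1 that depends only on the set of outcomes observed in the context.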
