
Learning at Variable Attentional Load Requires Cooperation of Working Memory, Meta-learning, and Attention-augmented Reinforcement Learning.

Affiliations

Vanderbilt University.

York University, Toronto, ON, Canada.

Publication Information

J Cogn Neurosci. 2021 Dec 6;34(1):79-107. doi: 10.1162/jocn_a_01780.

Abstract

Flexible learning of changing reward contingencies can be realized with different strategies. A fast learning strategy involves using working memory of recently rewarded objects to guide choices. A slower learning strategy uses prediction errors to gradually update value expectations to improve choices. How the fast and slow strategies work together in scenarios with real-world stimulus complexity is not well known. Here, we aim to disentangle their relative contributions in rhesus monkeys while they learned the relevance of object features at variable attentional load. We found that learning behavior across six monkeys is consistently best predicted with a model combining (i) fast working memory and (ii) slower reinforcement learning from differently weighted positive and negative prediction errors as well as (iii) selective suppression of nonchosen feature values and (iv) a meta-learning mechanism that enhances exploration rates based on a memory trace of recent errors. The optimal model parameter settings suggest that these mechanisms cooperate differently at low and high attentional loads. Whereas working memory was essential for efficient learning at lower attentional loads, enhanced weighting of negative prediction errors and meta-learning were essential for efficient learning at higher attentional loads. Together, these findings pinpoint a canonical set of learning mechanisms and suggest how they may cooperate when subjects flexibly adjust to environments with variable real-world attentional demands.
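
The interplay of the four mechanisms named above can be made concrete with a small simulation. The following Python sketch is purely illustrative and is not the authors' implementation: the class name HybridLearner, the feature-tuple stimulus encoding, and every parameter value are assumptions chosen only to show how fast working memory, asymmetric weighting of positive and negative prediction errors, suppression of nonchosen feature values, and an error-driven exploration rate could interact in a single learner.

```python
import math
import random
from collections import defaultdict


class HybridLearner:
    """Illustrative learner combining the four mechanisms named in the abstract.
    All parameter names and default values are assumptions, not fitted settings."""

    def __init__(self, alpha_pos=0.3, alpha_neg=0.5, decay=0.2,
                 wm_weight=1.0, wm_decay=0.3, beta_base=5.0, beta_gain=3.0):
        self.alpha_pos = alpha_pos    # learning rate for positive prediction errors
        self.alpha_neg = alpha_neg    # larger learning rate for negative prediction errors
        self.decay = decay            # suppression of nonchosen feature values
        self.wm_weight = wm_weight    # weight of the fast working-memory trace in choice
        self.wm_decay = wm_decay      # forgetting rate of the working-memory trace
        self.beta_base = beta_base    # baseline softmax inverse temperature
        self.beta_gain = beta_gain    # how strongly recent errors increase exploration
        self.values = defaultdict(float)  # slow reinforcement-learned feature values
        self.wm = defaultdict(float)      # fast working memory of recently rewarded objects
        self.error_trace = 0.0            # running memory trace of recent errors

    def object_value(self, obj):
        # An object is a tuple of feature labels; its value mixes the slow
        # feature values with the fast working-memory trace for that object.
        slow = sum(self.values[f] for f in obj) / len(obj)
        return slow + self.wm_weight * self.wm[obj]

    def choose(self, objects):
        # Meta-learning: a higher recent-error trace lowers the inverse
        # temperature, which increases exploration.
        beta = max(self.beta_base - self.beta_gain * self.error_trace, 0.1)
        weights = [math.exp(beta * self.object_value(o)) for o in objects]
        return random.choices(objects, weights=weights)[0]

    def update(self, chosen, objects, reward):
        # Reinforcement learning with asymmetric prediction-error weighting.
        pe = reward - sum(self.values[f] for f in chosen) / len(chosen)
        alpha = self.alpha_pos if pe > 0 else self.alpha_neg
        for f in chosen:
            self.values[f] += alpha * pe
        # Selective suppression of nonchosen feature values.
        for f in {f for o in objects for f in o if f not in chosen}:
            self.values[f] *= 1.0 - self.decay
        # Fast working memory: decay old traces, store the object if rewarded.
        for o in list(self.wm):
            self.wm[o] *= 1.0 - self.wm_decay
        if reward > 0:
            self.wm[chosen] = 1.0
        # Update the memory trace of recent errors that drives exploration.
        self.error_trace += 0.2 * ((1.0 - reward) - self.error_trace)


# Hypothetical trial: objects are tuples of feature labels, and reward depends
# on a single relevant feature (here assumed to be "striped").
learner = HybridLearner()
stimuli = [("red", "striped"), ("blue", "dotted"), ("green", "plain")]
choice = learner.choose(stimuli)
learner.update(choice, stimuli, reward=1.0 if "striped" in choice else 0.0)
```

In this sketch, raising wm_decay mimics the weaker contribution of working memory at higher attentional loads, while the error trace lowering beta captures the idea that accumulating errors push the learner toward more exploration.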

Similar Articles

Working Memory Load Strengthens Reward Prediction Errors.
J Neurosci. 2017 Apr 19;37(16):4332-4342. doi: 10.1523/JNEUROSCI.2700-16.2017. Epub 2017 Mar 20.

Anterior Cingulate Cortex Causally Supports Meta-Learning.
bioRxiv. 2024 Jun 13:2024.06.12.598723. doi: 10.1101/2024.06.12.598723.

Cited By

Anterior Cingulate Cortex Causally Supports Meta-Learning.
bioRxiv. 2024 Jun 13:2024.06.12.598723. doi: 10.1101/2024.06.12.598723.

Learning attentional templates for value-based decision-making.
Cell. 2024 Mar 14;187(6):1476-1489.e21. doi: 10.1016/j.cell.2024.01.041. Epub 2024 Feb 23.

References

The Role of Executive Function in Shaping Reinforcement Learning.
Curr Opin Behav Sci. 2021 Apr;38:66-73. doi: 10.1016/j.cobeha.2020.10.003. Epub 2020 Nov 14.

Reward-driven distraction: A meta-analysis.
Psychol Bull. 2020 Oct;146(10):872-899. doi: 10.1037/bul0000296. Epub 2020 Jul 20.
