Vanderbilt University.
York University, Toronto, ON, Canada.
J Cogn Neurosci. 2021 Dec 6;34(1):79-107. doi: 10.1162/jocn_a_01780.
Flexible learning of changing reward contingencies can be realized with different strategies. A fast learning strategy involves using working memory of recently rewarded objects to guide choices. A slower learning strategy uses prediction errors to gradually update value expectations to improve choices. How the fast and slow strategies work together in scenarios with real-world stimulus complexity is not well known. Here, we aim to disentangle their relative contributions in rhesus monkeys while they learned the relevance of object features at variable attentional load. We found that learning behavior across six monkeys was consistently best predicted with a model combining (i) fast working memory and (ii) slower reinforcement learning from differently weighted positive and negative prediction errors, as well as (iii) selective suppression of nonchosen feature values and (iv) a meta-learning mechanism that enhances exploration rates based on a memory trace of recent errors. The optimal model parameter settings suggest that these mechanisms cooperate differently at low and high attentional loads. Whereas working memory was essential for efficient learning at lower attentional loads, enhanced weighting of negative prediction errors and meta-learning were essential for efficient learning at higher attentional loads. Together, these findings pinpoint a canonical set of learning mechanisms and suggest how they may cooperate when subjects flexibly adjust to environments with variable real-world attentional demands.
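The four mechanisms named in the abstract can be sketched as a single update loop. This is an illustrative assumption of how such a hybrid model might be structured (class name, parameter names, and default values are hypothetical, not the authors' implementation): a fast one-trial working-memory trace, a slower feature-value learner with separate learning rates for positive and negative prediction errors, decay of nonchosen feature values, and an error trace that raises exploration after recent failures.

```python
import numpy as np

class HybridLearner:
    """Illustrative sketch of the abstract's four mechanisms (not the paper's code).

    (i)   fast working memory of the last rewarded object's features
    (ii)  slow RL with asymmetric learning rates for +/- prediction errors
    (iii) selective suppression (decay) of nonchosen feature values
    (iv)  meta-learning: a trace of recent errors boosts exploration
    """

    def __init__(self, n_features, alpha_pos=0.3, alpha_neg=0.5, decay=0.2,
                 wm_weight=0.6, wm_decay=0.5, beta=5.0, err_decay=0.8,
                 explore_gain=1.0, rng=None):
        self.V = np.zeros(n_features)    # slow RL feature values
        self.WM = np.zeros(n_features)   # fast working-memory trace
        self.err_trace = 0.0             # memory of recent errors
        self.alpha_pos, self.alpha_neg = alpha_pos, alpha_neg
        self.decay, self.wm_weight, self.wm_decay = decay, wm_weight, wm_decay
        self.beta, self.err_decay, self.explore_gain = beta, err_decay, explore_gain
        self.rng = rng or np.random.default_rng()

    def choose(self, options):
        # options: list of feature-index tuples, one tuple per object on screen
        vals = np.array([self.V[list(o)].sum()
                         + self.wm_weight * self.WM[list(o)].sum()
                         for o in options])
        # (iv) recent errors lower the effective inverse temperature -> more exploration
        beta_eff = self.beta / (1.0 + self.explore_gain * self.err_trace)
        p = np.exp(beta_eff * (vals - vals.max()))
        p /= p.sum()
        return self.rng.choice(len(options), p=p)

    def update(self, chosen, reward):
        idx = list(chosen)
        rpe = reward - self.V[idx].sum()
        # (ii) asymmetric weighting of positive vs. negative prediction errors
        alpha = self.alpha_pos if rpe > 0 else self.alpha_neg
        self.V[idx] += alpha * rpe
        # (iii) suppress values of nonchosen features
        mask = np.ones_like(self.V, dtype=bool)
        mask[idx] = False
        self.V[mask] *= (1.0 - self.decay)
        # (i) working memory: decays each trial, reset by a rewarded choice
        self.WM *= self.wm_decay
        if reward > 0:
            self.WM[idx] = 1.0
        # (iv) exponentially weighted trace of recent errors (reward in {0, 1})
        self.err_trace = (self.err_decay * self.err_trace
                          + (1.0 - self.err_decay) * (1.0 - reward))
```

On this sketch, lowering `wm_weight` or raising `alpha_neg` mimics the abstract's load-dependent regimes: working memory dominates fast learning at low load, while stronger negative-error weighting and error-driven exploration carry learning at high load.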