Vanderbilt University.
York University, Toronto, ON, Canada.
J Cogn Neurosci. 2021 Dec 6;34(1):79-107. doi: 10.1162/jocn_a_01780.
Flexible learning of changing reward contingencies can be realized with different strategies. A fast learning strategy involves using working memory of recently rewarded objects to guide choices. A slower learning strategy uses prediction errors to gradually update value expectations to improve choices. How the fast and slow strategies work together in scenarios with real-world stimulus complexity is not well known. Here, we aim to disentangle their relative contributions in rhesus monkeys while they learned the relevance of object features at variable attentional load. We found that learning behavior across six monkeys was consistently best predicted with a model combining (i) fast working memory and (ii) slower reinforcement learning from differently weighted positive and negative prediction errors, as well as (iii) selective suppression of nonchosen feature values and (iv) a meta-learning mechanism that enhances exploration rates based on a memory trace of recent errors. The optimal model parameter settings suggest that these mechanisms cooperate differently at low and high attentional loads. Whereas working memory was essential for efficient learning at lower attentional loads, enhanced weighting of negative prediction errors and meta-learning were essential for efficient learning at higher attentional loads. Together, these findings pinpoint a canonical set of learning mechanisms and suggest how they may cooperate when subjects flexibly adjust to environments with variable real-world attentional demands.
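The four mechanisms named in the abstract can be sketched as a single update loop. This is an illustrative assumption of how such a hybrid model might be structured (class name, parameter names, and default values are hypothetical, not the authors' implementation): a fast one-trial working-memory trace, a slower feature-value learner with separate learning rates for positive and negative prediction errors, decay of nonchosen feature values, and an error trace that raises exploration after recent failures.

```python
import numpy as np

class HybridLearner:
    """Illustrative sketch of the abstract's four mechanisms (not the paper's code).

    (i)   fast working memory of the last rewarded object's features
    (ii)  slow RL with asymmetric learning rates for +/- prediction errors
    (iii) selective suppression (decay) of nonchosen feature values
    (iv)  meta-learning: a trace of recent errors boosts exploration
    """

    def __init__(self, n_features, alpha_pos=0.3, alpha_neg=0.5, decay=0.2,
                 wm_weight=0.6, wm_decay=0.5, beta=5.0, err_decay=0.8,
                 explore_gain=1.0, rng=None):
        self.V = np.zeros(n_features)    # slow RL feature values
        self.WM = np.zeros(n_features)   # fast working-memory trace
        self.err_trace = 0.0             # memory of recent errors
        self.alpha_pos, self.alpha_neg = alpha_pos, alpha_neg
        self.decay, self.wm_weight, self.wm_decay = decay, wm_weight, wm_decay
        self.beta, self.err_decay, self.explore_gain = beta, err_decay, explore_gain
        self.rng = rng or np.random.default_rng()

    def choose(self, options):
        # options: list of feature-index tuples, one tuple per object on screen
        vals = np.array([self.V[list(o)].sum()
                         + self.wm_weight * self.WM[list(o)].sum()
                         for o in options])
        # (iv) recent errors lower the effective inverse temperature -> more exploration
        beta_eff = self.beta / (1.0 + self.explore_gain * self.err_trace)
        p = np.exp(beta_eff * (vals - vals.max()))
        p /= p.sum()
        return self.rng.choice(len(options), p=p)

    def update(self, chosen, reward):
        idx = list(chosen)
        rpe = reward - self.V[idx].sum()
        # (ii) asymmetric weighting of positive vs. negative prediction errors
        alpha = self.alpha_pos if rpe > 0 else self.alpha_neg
        self.V[idx] += alpha * rpe
        # (iii) suppress values of nonchosen features
        mask = np.ones_like(self.V, dtype=bool)
        mask[idx] = False
        self.V[mask] *= (1.0 - self.decay)
        # (i) working memory: decays each trial, reset by a rewarded choice
        self.WM *= self.wm_decay
        if reward > 0:
            self.WM[idx] = 1.0
        # (iv) exponentially weighted trace of recent errors (reward in {0, 1})
        self.err_trace = (self.err_decay * self.err_trace
                          + (1.0 - self.err_decay) * (1.0 - reward))
```

On this sketch, lowering `wm_weight` or raising `alpha_neg` mimics the abstract's load-dependent regimes: working memory dominates fast learning at low load, while stronger negative-error weighting and error-driven exploration carry learning at high load.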