

Generalized attention-weighted reinforcement learning.

Affiliations

Faculty of Technology, Bielefeld University, 33615, Germany; Computational Neuroscience Labs, ATR Institute International, 619-0288, Japan.

Computational Neuroscience Labs, ATR Institute International, 619-0288, Japan.

Publication Information

Neural Netw. 2022 Jan;145:10-21. doi: 10.1016/j.neunet.2021.09.023. Epub 2021 Oct 11.

DOI: 10.1016/j.neunet.2021.09.023
PMID: 34710787
Abstract

In neuroscience, attention has been shown to bidirectionally interact with reinforcement learning (RL) to reduce the dimensionality of task representations, restricting computations to relevant features. In machine learning, despite their popularity, attention mechanisms have seldom been administered to decision-making problems. Here, we leverage a theoretical model from computational neuroscience - the attention-weighted RL (AWRL), defining how humans identify task-relevant features (i.e., that allow value predictions) - to design an applied deep RL paradigm. We formally demonstrate that the conjunction of the self-attention mechanism, widely employed in machine learning, with value function approximation is a general formulation of the AWRL model. To evaluate our agent, we train it on three Atari tasks at different complexity levels, incorporating both task-relevant and irrelevant features. Because the model uses semantic observations, we can uncover not only which features the agent elects to base decisions on, but also how it chooses to compile more complex, relational features from simpler ones. We first show that performance depends in large part on the ability to compile new compound features, rather than mere focus on individual features. In line with neuroscience predictions, self-attention leads to high resiliency to noise (irrelevant features) compared to other benchmark models. Finally, we highlight the importance and separate contributions of both bottom-up and top-down attention in the learning process. Together, these results demonstrate the broader validity of the AWRL framework in complex task scenarios, and illustrate the benefits of a deeper integration between neuroscience-derived models and RL for decision making in machine learning.
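The abstract's central claim is that pairing a self-attention mechanism with value function approximation recovers the AWRL model: attention weights gate the observed features, and a value head reads out the attended representation. A minimal sketch of that pairing, with all names, shapes, and the toy relevance scores being illustrative assumptions rather than the authors' implementation:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_weighted_value(features, relevance, value_w):
    """Attention-gated value estimate (hypothetical sketch).

    A softmax over per-feature relevance scores yields attention
    weights; the weighted mix of feature vectors then feeds a
    linear value head -- the 'self-attention + value function
    approximation' conjunction the abstract describes.
    """
    attn = softmax(relevance)                        # attention over features
    d = len(features[0])
    attended = [sum(a * f[i] for a, f in zip(attn, features))
                for i in range(d)]                   # weighted feature mix
    return dot(attended, value_w), attn

# Two task-relevant features and one high-magnitude distractor
# whose low relevance score suppresses its contribution.
features = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
relevance = [2.0, 2.0, -4.0]
value, attn = attention_weighted_value(features, relevance, [1.0, 1.0])
```

In a trained agent the relevance scores would themselves be learned (queries against keys), so irrelevant features are down-weighted automatically — the resilience-to-noise property the abstract reports.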


Similar Articles

1. Generalized attention-weighted reinforcement learning. Neural Netw. 2022 Jan;145:10-21. doi: 10.1016/j.neunet.2021.09.023. Epub 2021 Oct 11.
2. Multiple memory systems as substrates for multiple decision systems. Neurobiol Learn Mem. 2015 Jan;117:4-13. doi: 10.1016/j.nlm.2014.04.014. Epub 2014 May 15.
3. Reinforcement learning and its connections with neuroscience and psychology. Neural Netw. 2022 Jan;145:271-287. doi: 10.1016/j.neunet.2021.10.003. Epub 2021 Oct 22.
4. Selective particle attention: Rapidly and flexibly selecting features for deep reinforcement learning. Neural Netw. 2022 Jun;150:408-421. doi: 10.1016/j.neunet.2022.03.015. Epub 2022 Mar 17.
5. Self-Supervised Discovering of Interpretable Features for Reinforcement Learning. IEEE Trans Pattern Anal Mach Intell. 2022 May;44(5):2712-2724. doi: 10.1109/TPAMI.2020.3037898. Epub 2022 Apr 1.
6. Deep Reinforcement Learning and Its Neuroscientific Implications. Neuron. 2020 Aug 19;107(4):603-616. doi: 10.1016/j.neuron.2020.06.014. Epub 2020 Jul 13.
7. Temporal-Spatial Causal Interpretations for Vision-Based Reinforcement Learning. IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):10222-10235. doi: 10.1109/TPAMI.2021.3133717. Epub 2022 Nov 7.
8. STACoRe: Spatio-temporal and action-based contrastive representations for reinforcement learning in Atari. Neural Netw. 2023 Mar;160:1-11. doi: 10.1016/j.neunet.2022.12.018. Epub 2022 Dec 29.
9. Computational evidence for hierarchically structured reinforcement learning in humans. Proc Natl Acad Sci U S A. 2020 Nov 24;117(47):29381-29389. doi: 10.1073/pnas.1912330117.
10. Performance of a Computational Model of the Mammalian Olfactory System.

Cited By

1. A Survey on Explainable Artificial Intelligence (XAI) Techniques for Visualizing Deep Learning Models in Medical Imaging. J Imaging. 2024 Sep 25;10(10):239. doi: 10.3390/jimaging10100239.
2. Selective particle attention: Rapidly and flexibly selecting features for deep reinforcement learning. Neural Netw. 2022 Jun;150:408-421. doi: 10.1016/j.neunet.2022.03.015. Epub 2022 Mar 17.