
Generalized attention-weighted reinforcement learning.

Affiliations

Faculty of Technology, Bielefeld University, 33615, Germany; Computational Neuroscience Labs, ATR Institute International, 619-0288, Japan.

Computational Neuroscience Labs, ATR Institute International, 619-0288, Japan.

Publication Information

Neural Netw. 2022 Jan;145:10-21. doi: 10.1016/j.neunet.2021.09.023. Epub 2021 Oct 11.

Abstract

In neuroscience, attention has been shown to bidirectionally interact with reinforcement learning (RL) to reduce the dimensionality of task representations, restricting computations to relevant features. In machine learning, despite their popularity, attention mechanisms have seldom been applied to decision-making problems. Here, we leverage a theoretical model from computational neuroscience - the attention-weighted RL (AWRL), defining how humans identify task-relevant features (i.e., those that allow value predictions) - to design an applied deep RL paradigm. We formally demonstrate that the conjunction of the self-attention mechanism, widely employed in machine learning, with value function approximation is a general formulation of the AWRL model. To evaluate our agent, we train it on three Atari tasks at different complexity levels, incorporating both task-relevant and irrelevant features. Because the model uses semantic observations, we can uncover not only which features the agent elects to base decisions on, but also how it chooses to compile more complex, relational features from simpler ones. We first show that performance depends in large part on the ability to compile new compound features, rather than mere focus on individual features. In line with neuroscience predictions, self-attention leads to high resilience to noise (irrelevant features) compared to other benchmark models. Finally, we highlight the importance and separate contributions of both bottom-up and top-down attention in the learning process. Together, these results demonstrate the broader validity of the AWRL framework in complex task scenarios, and illustrate the benefits of a deeper integration between neuroscience-derived models and RL for decision making in machine learning.
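The core claim above is architectural: composing self-attention over a set of semantic feature vectors with a value-function head yields an attention-weighted RL agent. The following is a minimal NumPy sketch of that composition, not the authors' exact architecture; the matrix shapes, the mean-pooling step, and all variable names are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a set of feature vectors.

    X: (n_features, d) semantic feature observations.
    Returns attended features (n_features, d) and the attention
    weight matrix (n_features, n_features), whose rows sum to 1.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return A @ V, A

# Toy setup: 3 semantic features of dimension 4, 2 actions (illustrative sizes).
rng = np.random.default_rng(0)
n_features, d, n_actions = 3, 4, 2
X = rng.normal(size=(n_features, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

Z, A = self_attention(X, Wq, Wk, Wv)

# Value-function approximation: pool the attention-weighted features,
# then map linearly to per-action Q-values.
W_value = rng.normal(size=(d, n_actions))
q_values = Z.mean(axis=0) @ W_value
```

Because the agent observes named semantic features rather than pixels, the rows of `A` are directly interpretable: they show which features (and which feature pairings) the value estimate attends to, which is what allows the paper's analysis of compound-feature compilation.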

