

Attention-Based Fault-Tolerant Approach for Multi-Agent Reinforcement Learning Systems

Authors

Gu Shanzhi, Geng Mingyang, Lan Long

Affiliations

College of Computer, National University of Defense Technology, Changsha 410073, China.

High Performance Computing Laboratory, National University of Defense Technology, Changsha 410073, China.

Publication

Entropy (Basel). 2021 Aug 31;23(9):1133. doi: 10.3390/e23091133.

DOI: 10.3390/e23091133
PMID: 34573757
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8469175/
Abstract

The aim of multi-agent reinforcement learning systems is to provide interacting agents with the ability to collaboratively learn and adapt to the behavior of other agents. Typically, an agent receives private observations that provide only a partial view of the true state of the environment. In realistic settings, however, a harsh environment may cause one or more agents to exhibit arbitrarily faulty or malicious behavior, which can be enough to make the current coordination mechanisms fail. In this paper, we study a practical multi-agent reinforcement learning scenario that considers the security issues arising in the presence of agents with arbitrarily faulty or malicious behavior. Previous state-of-the-art work that coped with extremely noisy environments was designed on the assumption that the noise intensity in the environment was known in advance; when the noise intensity changes, the existing method has to adjust its model configuration to learn in the new environment, which limits its practical applicability. To overcome these difficulties, we present an Attention-based Fault-Tolerant (FT-Attn) model, which selects not only correct but also relevant information for each agent at every time step in noisy environments. The multi-head attention mechanism enables the agents to learn effective communication policies through experience, concurrently with their action policies. Empirical results show that FT-Attn outperforms previous state-of-the-art methods in some extremely noisy environments, in both cooperative and competitive scenarios, and comes much closer to the upper-bound performance. Furthermore, FT-Attn exhibits a more general fault-tolerance ability and does not rely on prior knowledge of the noise intensity of the environment.
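The core mechanism the abstract describes — each agent using attention to weight the messages of its peers, so that a noisy or faulty peer can be downweighted rather than corrupting the aggregate — can be sketched in plain Python. This is a minimal illustration of scaled dot-product attention over agents' messages, not the paper's actual FT-Attn implementation; the query/key/value vectors and the "faulty peer" setup below are hypothetical stand-ins for the learned projections of each agent's observation.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention: weight each value vector by the
    softmax-normalised similarity between the query and its key."""
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in keys]
    weights = softmax(scores)
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return context, weights

# Toy setting: agent 0 aggregates messages from three peers, one of
# which (the last) is "faulty" and emits noise. In FT-Attn these
# vectors would come from learned encoders; here they are hand-picked.
random.seed(0)
query = [1.0, 0.0, 1.0, 0.0]
keys = [
    [1.0, 0.0, 1.0, 0.0],                    # well-aligned, relevant peer
    [0.5, 0.5, 0.5, 0.5],                    # partially relevant peer
    [random.gauss(0, 1) for _ in range(4)],  # faulty peer: random key
]
values = [
    [1.0, 2.0],
    [3.0, 4.0],
    [random.gauss(0, 5) for _ in range(2)],  # faulty peer: noisy message
]

context, weights = attend(query, keys, values)
print("attention weights:", [round(w, 3) for w in weights])
print("aggregated message:", [round(c, 3) for c in context])
```

A multi-head variant would simply run `attend` several times with different learned projections and concatenate the resulting context vectors, which is what lets each agent pick up multiple kinds of relevant information per time step.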


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5caa/8469175/43a70422b0fc/entropy-23-01133-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5caa/8469175/98cd67d04540/entropy-23-01133-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5caa/8469175/a9418d27788e/entropy-23-01133-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5caa/8469175/9423a8ea20d4/entropy-23-01133-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5caa/8469175/1f631cf9a5a0/entropy-23-01133-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5caa/8469175/61014046a96d/entropy-23-01133-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5caa/8469175/f85eb6b81d56/entropy-23-01133-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5caa/8469175/42dcb82b83eb/entropy-23-01133-g008.jpg

Similar articles

1. Attention-Based Fault-Tolerant Approach for Multi-Agent Reinforcement Learning Systems.
Entropy (Basel). 2021 Aug 31;23(9):1133. doi: 10.3390/e23091133.
2. Scalable and Transferable Reinforcement Learning for Multi-Agent Mixed Cooperative-Competitive Environments Based on Hierarchical Graph Attention.
Entropy (Basel). 2022 Apr 18;24(4):563. doi: 10.3390/e24040563.
3. You Were Always on My Mind: Introducing Chef's Hat and COPPER for Personalized Reinforcement Learning.
Front Robot AI. 2021 Jul 16;8:669990. doi: 10.3389/frobt.2021.669990. eCollection 2021.
4. Investigation of independent reinforcement learning algorithms in multi-agent environments.
Front Artif Intell. 2022 Sep 20;5:805823. doi: 10.3389/frai.2022.805823. eCollection 2022.
5. HyperComm: Hypergraph-based communication in multi-agent reinforcement learning.
Neural Netw. 2024 Oct;178:106432. doi: 10.1016/j.neunet.2024.106432. Epub 2024 Jun 10.
6. Knowledge Reuse of Multi-Agent Reinforcement Learning in Cooperative Tasks.
Entropy (Basel). 2022 Mar 28;24(4):470. doi: 10.3390/e24040470.
7. Learning and stabilization of altruistic behaviors in multi-agent systems by reciprocity.
Biol Cybern. 1998 Mar;78(3):197-205. doi: 10.1007/s004220050426.
8. Learning Attentional and Gated Communication via Curiosity.
Comput Intell Neurosci. 2022 Apr 26;2022:2951193. doi: 10.1155/2022/2951193. eCollection 2022.
9. Differentially Private Malicious Agent Avoidance in Multiagent Advising Learning.
IEEE Trans Cybern. 2020 Oct;50(10):4214-4227. doi: 10.1109/TCYB.2019.2906574. Epub 2019 Apr 11.
10. Multi-Agent Deep Reinforcement Learning for Multi-Robot Applications: A Survey.
Sensors (Basel). 2023 Mar 30;23(7):3625. doi: 10.3390/s23073625.

Cited by

1. An Improved Approach towards Multi-Agent Pursuit-Evasion Game Decision-Making Using Deep Reinforcement Learning.
Entropy (Basel). 2021 Oct 29;23(11):1433. doi: 10.3390/e23111433.

References

1. Learning to Cooperate via an Attention-Based Communication Neural Network in Decentralized Multi-Robot Exploration.
Entropy (Basel). 2019 Mar 19;21(3):294. doi: 10.3390/e21030294.
2. Multiagent cooperation and competition with deep reinforcement learning.
PLoS One. 2017 Apr 5;12(4):e0172395. doi: 10.1371/journal.pone.0172395. eCollection 2017.