Suppr 超能文献



Multi-Agent Hierarchical Graph Attention Actor-Critic Reinforcement Learning

Authors

Li Tongyue, Shi Dianxi, Jin Songchang, Wang Zhen, Yang Huanhuan, Chen Yang

Affiliations

Academy of Military Sciences, Beijing 100097, China.

Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin 300450, China.

Publication

Entropy (Basel). 2024 Dec 25;27(1):4. doi: 10.3390/e27010004.

DOI: 10.3390/e27010004
PMID: 39851624
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11764705/
Abstract

Multi-agent systems often face challenges such as elevated communication demands, intricate interactions, and difficulties in transferability. To address the issues of complex information interaction and model scalability, we propose an innovative hierarchical graph attention actor-critic reinforcement learning method. This method naturally models the interactions within a multi-agent system as a graph, employing hierarchical graph attention to capture the complex cooperative and competitive relationships among agents, thereby enhancing their adaptability to dynamic environments. Specifically, graph neural networks encode agent observations as single feature-embedding vectors whose dimensionality remains constant irrespective of the number of agents, which improves model scalability. Through the "inter-agent" and "inter-group" attention layers, the embedding vector of each agent is updated into an information-condensed and contextualized state representation, which extracts state-dependent relationships between agents and models interactions at both the individual and group levels. We conducted experiments across several multi-agent tasks to assess our proposed method's effectiveness, stability, and scalability. Furthermore, to enhance the applicability of our method to large-scale tasks, we tested and validated its performance within a curriculum learning training framework, thereby enhancing its transferability.

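The scalability claim in the abstract rests on attention pooling: however many neighboring agents exist, attending over their embeddings yields a context vector of fixed dimensionality. The following is a minimal numpy sketch of that idea, not the authors' implementation; the projection matrices `W_q`, `W_k`, `W_v`, the dimensions, and the function names are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(query, keys, values):
    """Scaled dot-product attention: pool a variable-size set of
    neighbor embeddings into a single fixed-size context vector."""
    d = query.shape[-1]
    scores = (keys @ query) / np.sqrt(d)  # one score per neighbor, shape (n,)
    weights = softmax(scores, axis=-1)    # attention distribution over neighbors
    return weights @ values               # shape (d_v,), independent of n

rng = np.random.default_rng(0)
d_obs, d_k, d_v = 8, 16, 16
# Illustrative projection matrices, randomly initialized for the sketch.
W_q = rng.normal(size=(d_obs, d_k))
W_k = rng.normal(size=(d_obs, d_k))
W_v = rng.normal(size=(d_obs, d_v))

def contextual_embedding(own_obs, neighbor_obs):
    """A single 'inter-agent' attention step: the agent's own observation
    forms the query; neighbor observations form keys and values."""
    q = own_obs @ W_q
    k = neighbor_obs @ W_k
    v = neighbor_obs @ W_v
    return attention_pool(q, k, v)

# The contextualized embedding has the same shape for 3 neighbors or 30,
# which is what lets the critic's input stay fixed as agent count grows.
e_small = contextual_embedding(rng.normal(size=d_obs), rng.normal(size=(3, d_obs)))
e_large = contextual_embedding(rng.normal(size=d_obs), rng.normal(size=(30, d_obs)))
print(e_small.shape, e_large.shape)  # both (16,)
```

In the paper's hierarchical scheme, a second ("inter-group") attention layer would pool these per-group context vectors the same way, again yielding a fixed-size representation.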

Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/f957092d3ffc/entropy-27-00004-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/879b67649eb7/entropy-27-00004-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/864bb7712474/entropy-27-00004-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/3481a5add359/entropy-27-00004-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/e77603cd6c90/entropy-27-00004-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/19c29f54f5af/entropy-27-00004-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/ea652339760a/entropy-27-00004-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/c078c30e7e06/entropy-27-00004-g008.jpg

Similar articles

1. Multi-Agent Hierarchical Graph Attention Actor-Critic Reinforcement Learning. Entropy (Basel). 2024 Dec 25;27(1):4. doi: 10.3390/e27010004.
2. Scalable and Transferable Reinforcement Learning for Multi-Agent Mixed Cooperative-Competitive Environments Based on Hierarchical Graph Attention. Entropy (Basel). 2022 Apr 18;24(4):563. doi: 10.3390/e24040563.
3. IHG-MA: Inductive heterogeneous graph multi-agent reinforcement learning for multi-intersection traffic signal control. Neural Netw. 2021 Jul;139:265-277. doi: 10.1016/j.neunet.2021.03.015. Epub 2021 Mar 22.
4. Meta attention for Off-Policy Actor-Critic. Neural Netw. 2023 Jun;163:86-96. doi: 10.1016/j.neunet.2023.03.024. Epub 2023 Mar 28.
5. Graph Soft Actor-Critic Reinforcement Learning for Large-Scale Distributed Multirobot Coordination. IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):665-676. doi: 10.1109/TNNLS.2023.3329530. Epub 2025 Jan 7.
6. Attention Enhanced Reinforcement Learning for Multi-Agent Cooperation. IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):8235-8249. doi: 10.1109/TNNLS.2022.3146858. Epub 2023 Oct 27.
7. Attention-augmented multi-domain cooperative graph representation learning for molecular interaction prediction. Neural Netw. 2025 Jun;186:107265. doi: 10.1016/j.neunet.2025.107265. Epub 2025 Feb 19.
8. A priority experience replay actor-critic algorithm using self-attention mechanism for strategy optimization of discrete problems. PeerJ Comput Sci. 2024 Jun 28;10:e2161. doi: 10.7717/peerj-cs.2161. eCollection 2024.
9. MMAgentRec, a personalized multi-modal recommendation agent with large language model. Sci Rep. 2025 Apr 8;15(1):12062. doi: 10.1038/s41598-025-96458-w.
10. Multi-agent reinforcement learning with approximate model learning for competitive games. PLoS One. 2019 Sep 11;14(9):e0222215. doi: 10.1371/journal.pone.0222215. eCollection 2019.
