
Multi-Agent Hierarchical Graph Attention Actor-Critic Reinforcement Learning.

Authors

Li Tongyue, Shi Dianxi, Jin Songchang, Wang Zhen, Yang Huanhuan, Chen Yang

Affiliations

Academy of Military Sciences, Beijing 100097, China.

Tianjin Artificial Intelligence Innovation Center (TAIIC), Tianjin 300450, China.

Publication

Entropy (Basel). 2024 Dec 25;27(1):4. doi: 10.3390/e27010004.

Abstract

Multi-agent systems often face challenges such as elevated communication demands, intricate interactions, and difficulties in transferability. To address the issues of complex information interaction and model scalability, we propose an innovative hierarchical graph attention actor-critic reinforcement learning method. This method naturally models the interactions within a multi-agent system as a graph, employing hierarchical graph attention to capture the complex cooperative and competitive relationships among agents, thereby enhancing their adaptability to dynamic environments. Specifically, graph neural networks encode agent observations as single feature-embedding vectors whose dimensionality remains constant irrespective of the number of agents, which improves model scalability. Through the "inter-agent" and "inter-group" attention layers, the embedding vector of each agent is updated into an information-condensed and contextualized state representation, which extracts state-dependent relationships between agents and models interactions at both the individual and group levels. We conducted experiments across several multi-agent tasks to assess our proposed method's effectiveness, stability, and scalability. Furthermore, to enhance the applicability of our method in large-scale tasks, we tested and validated its performance within a curriculum learning training framework, thereby enhancing its transferability.
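The two-level attention scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names (`attention_pool`, `hierarchical_context`) and the projection matrices `W_q`, `W_k`, `W_v` are hypothetical, and the group summary is simplified to a mean over agent contexts. The point it demonstrates is the scalability property from the abstract: the output embedding dimension stays fixed as the number of agents per group changes.

```python
import numpy as np

def attention_pool(queries, keys, values):
    """Scaled dot-product attention: each row attends over all rows.

    queries, keys, values: (N, d) arrays for N agents (or groups).
    Returns an (N, d) contextualized embedding; d is independent of N.
    """
    d = queries.shape[1]
    scores = queries @ keys.T / np.sqrt(d)            # (N, N) pairwise relevance
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
    return weights @ values                           # (N, d) condensed context

def hierarchical_context(obs_by_group, W_q, W_k, W_v):
    """Two levels: 'inter-agent' attention within each group, then
    'inter-group' attention over per-group summary embeddings."""
    group_ctx = []
    for obs in obs_by_group:                          # obs: (n_i, d_obs)
        h = obs @ W_v                                 # embed observations
        ctx = attention_pool(obs @ W_q, obs @ W_k, h)
        group_ctx.append(ctx.mean(axis=0))            # (d,) group summary
    G = np.stack(group_ctx)                           # (n_groups, d)
    return attention_pool(G, G, G)                    # (n_groups, d)

rng = np.random.default_rng(0)
d_obs = d = 6
W_q, W_k, W_v = (rng.standard_normal((d_obs, d)) for _ in range(3))
# Two groups with different agent counts; output shape is (n_groups, d)
groups = [rng.standard_normal((3, d_obs)), rng.standard_normal((5, d_obs))]
out = hierarchical_context(groups, W_q, W_k, W_v)
print(out.shape)  # (2, 6)
```

Because each level pools a variable number of inputs into fixed-size vectors, the critic built on top of these embeddings does not need to be resized when agents are added or removed, which is what enables transfer under the curriculum-learning setup the abstract mentions.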


Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f15/11764705/f957092d3ffc/entropy-27-00004-g001.jpg
