
VGN: Value Decomposition With Graph Attention Networks for Multiagent Reinforcement Learning.

Authors

Wei Qinglai, Li Yugu, Zhang Jie, Wang Fei-Yue

Publication

IEEE Trans Neural Netw Learn Syst. 2024 Jan;35(1):182-195. doi: 10.1109/TNNLS.2022.3172572. Epub 2024 Jan 4.

DOI: 10.1109/TNNLS.2022.3172572
PMID: 35584069
Abstract

Although value decomposition networks and the follow-on value-based studies factorize the joint reward function into individual reward functions for a class of cooperative multiagent reinforcement learning problems, in which each agent has its own local observation and shares a joint reward signal, most previous efforts ignored the graphical information between agents. In this article, a new value decomposition with graph attention network (VGN) method is developed that solves the value functions by introducing the dynamical relationships between agents. In this approach, the decomposition factor of an agent can be influenced by the reward signals of all related agents, and two graph-neural-network-based algorithms (VGN-Linear and VGN-Nonlinear) are designed to solve the value function of each agent. It can be proved theoretically that the presented methods satisfy the factorizable condition in the centralized training process. The performance of the presented methods is evaluated on the StarCraft Multiagent Challenge (SMAC) benchmark. Experimental results show that our method outperforms state-of-the-art value-based multiagent reinforcement learning algorithms, especially on very hard tasks that are challenging for existing methods.
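The abstract sketches the architecture but gives no implementation. Below is a minimal PyTorch sketch, in the spirit of the VGN-Linear variant, of how per-agent Q-values could be mixed with graph attention; the class name `GraphAttentionMixer`, the single attention head, the layer sizes, and the absolute-value trick for keeping mixing weights non-negative are all illustrative assumptions, not the authors' released code.

```python
# Minimal sketch, not the authors' code: a graph-attention mixer that
# combines per-agent Q-values into a joint Q_tot, in the spirit of the
# "VGN-Linear" variant described in the abstract. All names and layer
# sizes here are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionMixer(nn.Module):
    def __init__(self, state_dim: int, embed_dim: int = 32):
        super().__init__()
        self.proj = nn.Linear(state_dim, embed_dim)  # per-agent embedding
        self.attn = nn.Linear(2 * embed_dim, 1)      # pairwise attention score

    def forward(self, agent_qs, agent_states, adj):
        # agent_qs:     (B, N)            individual Q-values
        # agent_states: (B, N, state_dim) local observations / hidden states
        # adj:          (N, N)            1 where agents are related
        #                                 (assumed to include self-loops)
        h = torch.tanh(self.proj(agent_states))              # (B, N, E)
        n = h.size(1)
        hi = h.unsqueeze(2).expand(-1, -1, n, -1)            # (B, N, N, E)
        hj = h.unsqueeze(1).expand(-1, n, -1, -1)            # (B, N, N, E)
        scores = self.attn(torch.cat([hi, hj], dim=-1)).squeeze(-1)
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = F.softmax(scores, dim=-1)                    # attend over neighbors
        # Each agent's mixing weight depends on its attended neighborhood,
        # so the decomposition factor of agent i is influenced by all
        # related agents. Weights are kept non-negative so dQ_tot/dQ_i >= 0,
        # one sufficient way to satisfy a factorizable (monotonicity)
        # condition of the kind the abstract refers to.
        w = torch.abs((alpha @ h).sum(dim=-1))               # (B, N)
        return (w * agent_qs).sum(dim=-1, keepdim=True)      # (B, 1) joint Q
```

With non-negative mixing weights, the greedy joint action agrees with each agent's individual greedy action, which is the usual way such mixers keep decentralized execution consistent with centralized training; a VGN-Nonlinear analogue would replace the weighted sum with a monotonic network over the individual Q-values.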

Similar Articles

1
VGN: Value Decomposition With Graph Attention Networks for Multiagent Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2024 Jan;35(1):182-195. doi: 10.1109/TNNLS.2022.3172572. Epub 2024 Jan 4.
2
Multiagent Reinforcement Learning With Heterogeneous Graph Attention Network.
IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):6851-6860. doi: 10.1109/TNNLS.2022.3215774. Epub 2023 Oct 5.
3
SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multiagent Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2023 Jan;34(1):52-63. doi: 10.1109/TNNLS.2021.3089493. Epub 2023 Jan 5.
4
TVDO: Tchebycheff Value-Decomposition Optimization for Multiagent Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2025 Jul;36(7):12521-12534. doi: 10.1109/TNNLS.2024.3455422.
5
Reinforcement Learning With Task Decomposition for Cooperative Multiagent Systems.
IEEE Trans Neural Netw Learn Syst. 2021 May;32(5):2054-2065. doi: 10.1109/TNNLS.2020.2996209. Epub 2021 May 3.
6
MuDE: Multi-agent decomposed reward-based exploration.
Neural Netw. 2024 Nov;179:106565. doi: 10.1016/j.neunet.2024.106565. Epub 2024 Jul 22.
7
UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios.
IEEE Trans Neural Netw Learn Syst. 2023 Apr;34(4):2093-2104. doi: 10.1109/TNNLS.2021.3105869. Epub 2023 Apr 4.
8
Multiagent Reinforcement Learning With Graphical Mutual Information Maximization.
IEEE Trans Neural Netw Learn Syst. 2023 Feb 16;PP. doi: 10.1109/TNNLS.2023.3243557.
9
Semicentralized Deep Deterministic Policy Gradient in Cooperative StarCraft Games.
IEEE Trans Neural Netw Learn Syst. 2022 Apr;33(4):1584-1593. doi: 10.1109/TNNLS.2020.3042943. Epub 2022 Apr 4.
10
A Distributional Perspective on Multiagent Cooperation With Deep Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4246-4259. doi: 10.1109/TNNLS.2022.3202097. Epub 2024 Feb 29.