Investigation of independent reinforcement learning algorithms in multi-agent environments

Authors

Lee Ken Ming, Ganapathi Subramanian Sriram, Crowley Mark

Affiliation

Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada.

Publication

Front Artif Intell. 2022 Sep 20;5:805823. doi: 10.3389/frai.2022.805823. eCollection 2022.

DOI:10.3389/frai.2022.805823
PMID:36204598
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9530713/
Abstract

Independent reinforcement learning algorithms have no theoretical guarantees for finding the best policy in multi-agent settings. However, in practice, prior works have reported good performance with independent algorithms in some domains and bad performance in others. Moreover, a comprehensive study of the strengths and weaknesses of independent algorithms is lacking in the literature. In this paper, we carry out an empirical comparison of the performance of independent algorithms on seven PettingZoo environments that span the three main categories of multi-agent environments, i.e., cooperative, competitive, and mixed. For the cooperative setting, we show that independent algorithms can perform on par with multi-agent algorithms in fully-observable environments, while adding recurrence improves the learning of independent algorithms in partially-observable environments. In the competitive setting, independent algorithms can perform on par with or better than multi-agent algorithms, even in more challenging environments. We also show that agents trained with independent algorithms learn to perform well individually, but fail to learn to cooperate with allies and compete with enemies in mixed environments.
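The setup the abstract describes, namely each agent running its own single-agent learner and treating the other agents as part of a non-stationary environment, can be illustrated with a minimal, self-contained sketch. This uses a hand-rolled 2x2 cooperative coordination game rather than one of the paper's PettingZoo environments; the payoff matrix, class names, and hyperparameters are all illustrative assumptions, not taken from the paper:

```python
import random
from collections import defaultdict

# Shared reward for the joint action (action_a, action_b): the two
# coordinated outcomes pay 1.0, mis-coordination pays 0.0.
PAYOFF = {
    (0, 0): 1.0, (0, 1): 0.0,
    (1, 0): 0.0, (1, 1): 1.0,
}

class IndependentQLearner:
    """A stateless epsilon-greedy Q-learner that sees only its own
    action and the shared reward, never the other agent's action."""

    def __init__(self, n_actions=2, alpha=0.1, epsilon=0.1):
        self.q = defaultdict(float)
        self.n_actions = n_actions
        self.alpha = alpha
        self.epsilon = epsilon

    def act(self):
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[a])

    def update(self, action, reward):
        # Independent update: the partner's behavior is folded into the
        # reward signal, which is what makes the environment appear
        # non-stationary from each agent's point of view.
        self.q[action] += self.alpha * (reward - self.q[action])

random.seed(0)
agents = [IndependentQLearner(), IndependentQLearner()]
for _ in range(5000):
    a, b = agents[0].act(), agents[1].act()
    r = PAYOFF[(a, b)]
    agents[0].update(a, r)
    agents[1].update(b, r)

# The greedy joint action after training; with independent learners on
# this game the pair typically settles on one of the two equilibria.
greedy = tuple(max(range(2), key=lambda x: ag.q[x]) for ag in agents)
print(greedy)
```

On this easy fully-observable coordination game the independent learners lock onto an equilibrium, which mirrors the paper's finding that independent algorithms can hold their own in cooperative settings; the abstract's negative result concerns mixed environments, where this kind of learner fails to credit cooperation with allies.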


Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/81a512b33508/frai-05-805823-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/6a33b89f54be/frai-05-805823-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/5eff86c029b0/frai-05-805823-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/7bdf6a23ec4d/frai-05-805823-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/4402618caa4a/frai-05-805823-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/0a7115001ec3/frai-05-805823-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/03c9d29c92b1/frai-05-805823-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/dfd6cecb421b/frai-05-805823-g0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6d58/9530713/d13f7a35e880/frai-05-805823-g0009.jpg

Similar articles

1
Investigation of independent reinforcement learning algorithms in multi-agent environments.
Front Artif Intell. 2022 Sep 20;5:805823. doi: 10.3389/frai.2022.805823. eCollection 2022.
2
LJIR: Learning Joint-Action Intrinsic Reward in cooperative multi-agent reinforcement learning.
Neural Netw. 2023 Oct;167:450-459. doi: 10.1016/j.neunet.2023.08.016. Epub 2023 Aug 22.
3
IHG-MA: Inductive heterogeneous graph multi-agent reinforcement learning for multi-intersection traffic signal control.
Neural Netw. 2021 Jul;139:265-277. doi: 10.1016/j.neunet.2021.03.015. Epub 2021 Mar 22.
4
Knowledge Reuse of Multi-Agent Reinforcement Learning in Cooperative Tasks.
Entropy (Basel). 2022 Mar 28;24(4):470. doi: 10.3390/e24040470.
5
An off-policy multi-agent stochastic policy gradient algorithm for cooperative continuous control.
Neural Netw. 2024 Feb;170:610-621. doi: 10.1016/j.neunet.2023.11.046. Epub 2023 Nov 23.
6
Multi-agent reinforcement learning with approximate model learning for competitive games.
PLoS One. 2019 Sep 11;14(9):e0222215. doi: 10.1371/journal.pone.0222215. eCollection 2019.
7
Deep Multi-Critic Network for accelerating Policy Learning in multi-agent environments.
Neural Netw. 2020 Aug;128:97-106. doi: 10.1016/j.neunet.2020.04.023. Epub 2020 May 4.
8
Emergent Solutions to High-Dimensional Multitask Reinforcement Learning.
Evol Comput. 2018 Fall;26(3):347-380. doi: 10.1162/evco_a_00232. Epub 2018 Jun 22.
9
Attention-Based Fault-Tolerant Approach for Multi-Agent Reinforcement Learning Systems.
Entropy (Basel). 2021 Aug 31;23(9):1133. doi: 10.3390/e23091133.
10
HyperComm: Hypergraph-based communication in multi-agent reinforcement learning.
Neural Netw. 2024 Oct;178:106432. doi: 10.1016/j.neunet.2024.106432. Epub 2024 Jun 10.

Cited by

1
Multi-Agent Reinforcement Learning in Games: Research and Applications.
Biomimetics (Basel). 2025 Jun 6;10(6):375. doi: 10.3390/biomimetics10060375.

References

1
Multiagent cooperation and competition with deep reinforcement learning.
PLoS One. 2017 Apr 5;12(4):e0172395. doi: 10.1371/journal.pone.0172395. eCollection 2017.
2
Human-level control through deep reinforcement learning.
Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
3
Stochastic Games.
Proc Natl Acad Sci U S A. 1953 Oct;39(10):1095-100. doi: 10.1073/pnas.39.10.1095.