
Generative subgoal oriented multi-agent reinforcement learning through potential field.

Affiliations

Academy of Military Science, Beijing, 100000, China.

Publication Information

Neural Netw. 2024 Nov;179:106552. doi: 10.1016/j.neunet.2024.106552. Epub 2024 Jul 17.

DOI: 10.1016/j.neunet.2024.106552
PMID: 39089154
Abstract

Multi-agent reinforcement learning (MARL) can markedly improve agents' learning speed in sparse-reward tasks when guided by subgoals. However, existing works sever the consistency of the learning objectives between the subgoal-generation and subgoal-reaching stages, which significantly inhibits the effectiveness of subgoal learning. To address this problem, we propose a novel Potential field Subgoal-based Multi-Agent reinforcement learning (PSMA) method, which introduces the potential field (PF) to unify the learning objectives of the two stages. Specifically, we design a state-to-PF representation model that describes agents' states as potential fields, allowing easy measurement of the interaction effect of both allied and enemy agents. With the PF representation, a subgoal selector is designed to automatically generate multiple subgoals for each agent, drawn from an experience replay buffer that contains both individual and total PF values. Based on the determined subgoals, we define an intrinsic reward function that guides each agent to reach its respective subgoal while maximizing the joint action-value. Experimental results show that our method outperforms state-of-the-art MARL methods on both StarCraft II micro-management (SMAC) and Google Research Football (GRF) tasks with sparse reward settings.
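The pipeline the abstract describes (state-to-PF representation, then a subgoal-conditioned intrinsic reward) can be sketched roughly as below. This is a minimal illustration, not the paper's actual formulation: the signed inverse-distance field, the function names `potential_field` and `intrinsic_reward`, and the absolute-difference reward shape are all assumptions made for the example.

```python
import numpy as np

def potential_field(agent_pos, entity_positions, signs, eps=1e-6):
    """Toy state-to-PF representation: sum of signed inverse-distance
    contributions, with allied entities contributing positively (+1)
    and enemy entities negatively (-1), evaluated at the agent's position."""
    agent_pos = np.asarray(agent_pos, dtype=float)
    pf = 0.0
    for pos, sign in zip(entity_positions, signs):
        dist = np.linalg.norm(agent_pos - np.asarray(pos, dtype=float))
        pf += sign / (dist + eps)  # closer entities dominate the field value
    return pf

def intrinsic_reward(agent_pf, subgoal_pf, scale=1.0):
    """Hypothetical subgoal-based intrinsic reward: the closer the agent's
    current PF value is to its subgoal's PF value, the higher the reward
    (maximum 0 when they coincide)."""
    return -scale * abs(agent_pf - subgoal_pf)

# An agent at the origin, one ally at (1, 0) and one enemy at (3, 0):
pf_now = potential_field([0.0, 0.0], [(1.0, 0.0), (3.0, 0.0)], [+1, -1])
r_int = intrinsic_reward(pf_now, subgoal_pf=1.0)
```

In a full training loop this intrinsic term would typically be added to the (sparse) environment reward, so the agent receives dense feedback for moving toward its assigned subgoal even before the task reward appears.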

Similar Articles

1. Generative subgoal oriented multi-agent reinforcement learning through potential field.
   Neural Netw. 2024 Nov;179:106552. doi: 10.1016/j.neunet.2024.106552. Epub 2024 Jul 17.
2. LJIR: Learning Joint-Action Intrinsic Reward in cooperative multi-agent reinforcement learning.
   Neural Netw. 2023 Oct;167:450-459. doi: 10.1016/j.neunet.2023.08.016. Epub 2023 Aug 22.
3. Strangeness-driven exploration in multi-agent reinforcement learning.
   Neural Netw. 2024 Apr;172:106149. doi: 10.1016/j.neunet.2024.106149. Epub 2024 Jan 26.
4. End-to-End Hierarchical Reinforcement Learning With Integrated Subgoal Discovery.
   IEEE Trans Neural Netw Learn Syst. 2022 Dec;33(12):7778-7790. doi: 10.1109/TNNLS.2021.3087733. Epub 2022 Nov 30.
5. MuDE: Multi-agent decomposed reward-based exploration.
   Neural Netw. 2024 Nov;179:106565. doi: 10.1016/j.neunet.2024.106565. Epub 2024 Jul 22.
6. Value-Based Subgoal Discovery and Path Planning for Reaching Long-Horizon Goals.
   IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):10288-10300. doi: 10.1109/TNNLS.2023.3240004. Epub 2024 Aug 5.
7. Discovering Intrinsic Subgoals for Vision-and-Language Navigation via Hierarchical Reinforcement Learning.
   IEEE Trans Neural Netw Learn Syst. 2025 Apr;36(4):6516-6528. doi: 10.1109/TNNLS.2024.3398300. Epub 2025 Apr 4.
8. HyperComm: Hypergraph-based communication in multi-agent reinforcement learning.
   Neural Netw. 2024 Oct;178:106432. doi: 10.1016/j.neunet.2024.106432. Epub 2024 Jun 10.
9. Credit assignment with predictive contribution measurement in multi-agent reinforcement learning.
   Neural Netw. 2023 Jul;164:681-690. doi: 10.1016/j.neunet.2023.05.021. Epub 2023 May 20.
10. Hierarchical Attention Master-Slave for heterogeneous multi-agent reinforcement learning.
   Neural Netw. 2023 May;162:359-368. doi: 10.1016/j.neunet.2023.02.037. Epub 2023 Mar 4.