Du Wei, Ding Shifei, Zhang Chenglong, Shi Zhongzhi
IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):6851-6860. doi: 10.1109/TNNLS.2022.3215774. Epub 2023 Oct 5.
Most recent research on multiagent reinforcement learning (MARL) has explored how to deploy cooperative policies for homogeneous agents. However, realistic multiagent environments may contain heterogeneous agents with different attributes or tasks. The heterogeneity of the agents and the diversity of their relationships make policy learning excessively difficult. To tackle this difficulty, we present a novel method that employs a heterogeneous graph attention network to model the relationships between heterogeneous agents. The proposed method generates an integrated feature representation for each agent by hierarchically aggregating the latent features of neighboring agents, with importance fully considered at both the agent level and the relationship level. The method is agnostic to the specific MARL algorithm and can be flexibly combined with diverse value decomposition methods. We conduct experiments in predator-prey and StarCraft Multiagent Challenge (SMAC) environments, and the empirical results demonstrate that our method outperforms existing methods in several heterogeneous scenarios.
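The hierarchical aggregation described above can be illustrated with a minimal sketch: attend over neighbor agents within each relation type to get a per-relation embedding, then attend over those relation embeddings to get the integrated representation. All function names, the projection matrix `W`, the relation query `q`, and the ally/enemy grouping below are hypothetical simplifications for illustration, not the paper's actual architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def agent_level_aggregate(h_self, neighbors, W):
    """Agent-level attention: weight neighbors of one relation type
    by dot-product scores against the agent's own projected feature."""
    proj = neighbors @ W                 # project neighbor features
    scores = proj @ (h_self @ W)         # one score per neighbor
    alpha = softmax(scores)              # agent-level importance weights
    return alpha @ proj                  # weighted sum -> relation embedding

def relation_level_aggregate(rel_embs, q):
    """Relation-level attention: weight the per-relation embeddings
    by their scores against a (hypothetical) relation query vector q."""
    beta = softmax(rel_embs @ q)         # relation-level importance weights
    return beta @ rel_embs               # integrated feature representation

rng = np.random.default_rng(0)
d = 4
h_self = rng.normal(size=d)              # the focal agent's latent feature
W = rng.normal(size=(d, d))              # shared projection (illustrative)
allies = rng.normal(size=(3, d))         # neighbors under relation type 1
enemies = rng.normal(size=(2, d))        # neighbors under relation type 2
rel_embs = np.stack(
    [agent_level_aggregate(h_self, n, W) for n in (allies, enemies)]
)
q = rng.normal(size=d)
z = relation_level_aggregate(rel_embs, q)
print(z.shape)  # -> (4,)
```

The two softmax stages mirror the abstract's "agent level" and "relationship level" importance: the resulting `z` could then feed any value decomposition network, consistent with the method being agnostic to the underlying MARL algorithm.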