Orzan Nicole, Acar Erman, Grossi Davide, Rădulescu Roxana
Bernoulli Institute for Mathematics, Computer Science and Artificial Intelligence, University of Groningen, 9747 AG Nijenborgh, Groningen, The Netherlands.
Institute for Logic, Language and Computation, University of Amsterdam, 1098XH Science Park, Amsterdam, The Netherlands.
Neural Comput Appl. 2025;37(23):18899-18932. doi: 10.1007/s00521-024-10530-6. Epub 2025 Jan 30.
Communication is a widely used mechanism to promote cooperation in multi-agent systems. In the field of emergent communication, agents are typically trained in one specific type of environment: cooperative, competitive or mixed-motive. Motivated by the idea that real-world settings are characterized by incomplete information and that humans face daily interactions under a wide spectrum of incentives, we explore the role of emergent communication when it is used across all these contexts simultaneously. We pursue this line of research by focusing on social dilemmas. To this end, we developed an extended version of the Public Goods Game, which allows us to train independent reinforcement learning agents simultaneously in different scenarios where incentives are (mis)aligned to various extents. Additionally, agents experience uncertainty about how well their incentives align with those of others. We equip agents with the ability to learn a communication policy and study the impact of emergent communication under this uncertainty. Our findings show that in settings where all agents have the same level of uncertainty, communication can enhance the cooperation of the whole group. However, in cases of asymmetric uncertainty, the agents that do not face uncertainty learn to use communication to deceive and exploit their uncertain peers.
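The abstract does not spell out the extended Public Goods Game, so the following is only a minimal sketch of a standard linear public goods round, not the authors' formulation. The multiplication factor `f`, the function names, the regime thresholds (`f < 1`, `f > n`) and the Gaussian observation noise `sigma` are all assumptions introduced here for illustration. It shows how one parameter could move a game between competitive, mixed-motive and cooperative incentives, and how an agent might see that parameter only imperfectly.

```python
import random


def pgg_payoffs(endowments, actions, f):
    """Payoffs for one round of a linear public goods game (illustrative).

    actions[i] is 1 if agent i contributes its whole endowment, else 0.
    Contributions are summed, multiplied by f, and split equally.
    """
    n = len(endowments)
    pot = sum(c * a for c, a in zip(endowments, actions))
    share = f * pot / n
    # A defector keeps its endowment on top of the equal share.
    return [share + c * (1 - a) for c, a in zip(endowments, actions)]


def incentive_regime(f, n):
    """Conventional reading of the multiplication factor (assumed here).

    f < 1: contributing destroys value, so interests conflict.
    f > n: contributing pays off even for the individual.
    Otherwise: the group gains from contributions but each agent is
    tempted to free-ride (boundary values are lumped in here).
    """
    if f < 1:
        return "competitive"
    if f > n:
        return "cooperative"
    return "mixed-motive"


def noisy_observation(f, sigma, rng):
    """Hypothetical uncertainty model: the agent sees f plus Gaussian noise."""
    return f if sigma == 0 else rng.gauss(f, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    f = 1.5
    print(incentive_regime(f, 2))              # regime for two agents
    print(pgg_payoffs([1, 1], [1, 0], f))      # one cooperator, one defector
    print(noisy_observation(f, 0.0, rng))      # agent without uncertainty
    print(noisy_observation(f, 1.0, rng))      # agent with uncertainty
```

Under these assumptions, asymmetric uncertainty corresponds to giving agents different `sigma` values: an agent with `sigma == 0` knows which regime it is in, while a peer with a large `sigma` may not.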