Zhao Zhuoya, Zhao Feifei, Zhao Yuxuan, Zeng Yi, Sun Yinqian
Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
School of Future Technology, University of Chinese Academy of Sciences, Beijing 100049, China.
Patterns (N Y). 2023 Jun 23;4(8):100775. doi: 10.1016/j.patter.2023.100775. eCollection 2023 Aug 11.
During dynamic social interaction, inferring and predicting others' behaviors through theory of mind (ToM) is crucial for obtaining benefits in cooperative and competitive tasks. Current multi-agent reinforcement learning (MARL) methods select behaviors primarily from agent observations and do not draw on ToM, which limits their performance. In this article, we propose a multi-agent ToM decision-making (MAToM-DM) model, which consists of a MAToM spiking neural network (MAToM-SNN) module and a decision-making module. We design two brain-inspired ToM modules, Self-MAToM and Other-MAToM, which predict others' behaviors based on self-experience and on observations of others, respectively. Each agent can then adjust its behavior according to the predicted actions of others. Experiments on cooperative and competitive tasks demonstrate the effectiveness of the proposed model. The results indicate that integrating the ToM mechanism improves the efficiency of cooperation and competition and yields higher rewards than traditional MARL models.