Jin Kai, Tang Pingzhong, Chen Shiteng, Peng Jianqing
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology (HKUST), Hong Kong, China.
Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China.
Front Neurorobot. 2021 May 20;15:674949. doi: 10.3389/fnbot.2021.674949. eCollection 2021.
In recent years, integrating ideas from game theory into multi-robot systems research has become a trend. In this paper, a team-competition model is proposed to solve a dynamic multi-robot task allocation problem. The allocation problem, which arises in many real-life applications, asks how to assign tasks to robots such that the most suitable robot is selected to execute the most appropriate task. Specifically, we study multi-round team competitions between two teams, where in each round both teams simultaneously select one of their players and each player can play at most once; this defines an extensive-form game with perfect recall. We also study a common variant in which one team always selects its player before the other team in each round. Regarding the robots as the players of the first team and the tasks as the players of the second team, the sub-game perfect strategy of the first team, computed by solving the team competition, gives a solution for allocating the tasks to the robots: it specifies how to select the robot (according to some probability distribution if the two teams move simultaneously) to execute the upcoming task in each round, based on the results of the matches in the previous rounds. We prove many properties of the sub-game perfect equilibria of the team competition game. First, we show that the uniformly random strategy is a sub-game perfect equilibrium strategy for both teams when there are no redundant players. Second, a team can safely abandon its weak players if it has redundant players and the strength of players is transitive. We then focus on the more interesting case where there are redundant players and the strength of players is not transitive. In this case, we obtain several counterintuitive results. For example, a player might help improve the payoff of its team even if it is dominated by the entire other team. We also study the extent to which the dominated players can increase the payoff. Very similar results hold for the aforementioned variant in which the two teams take actions in turn.
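As an illustration of the turn-taking variant described above, the following is a minimal sketch of backward induction over the game tree, not the authors' implementation. It rests on assumptions that the abstract does not state: the payoff of the first team is taken to be its expected number of won rounds, the game is treated as zero-sum, and the strengths are given by a hypothetical matrix `P` where `P[i][j]` is the probability that player `i` of the first team beats player `j` of the second team. The function name `sequential_value` is also hypothetical. The simultaneous-move game would additionally require solving a zero-sum matrix game at every state, which is omitted here.

```python
from functools import lru_cache


def sequential_value(P, rounds):
    """Value to team A when A commits to a player first in every round.

    Assumptions (not from the source): payoff is A's expected number of
    won rounds, team B minimizes that payoff, and P[i][j] is the
    probability that A's player i beats B's player j.
    """
    n_a, n_b = len(P), len(P[0])
    if rounds > min(n_a, n_b):
        raise ValueError("each player can play at most once")

    @lru_cache(maxsize=None)
    def value(used_a, used_b, rounds_left):
        # used_a / used_b are bitmasks of players that have already played.
        if rounds_left == 0:
            return 0.0
        best = float("-inf")
        for i in range(n_a):
            if (used_a >> i) & 1:
                continue
            # B observes A's choice i and picks the reply that is worst for A.
            worst = float("inf")
            for j in range(n_b):
                if (used_b >> j) & 1:
                    continue
                v = P[i][j] + value(used_a | (1 << i),
                                    used_b | (1 << j),
                                    rounds_left - 1)
                worst = min(worst, v)
            # A picks the player whose worst case is best.
            best = max(best, worst)
        return best

    return value(0, 0, rounds)


if __name__ == "__main__":
    # Hypothetical 3-vs-2 instance: team A has one redundant player.
    P = [[0.6, 0.3],
         [0.2, 0.7],
         [0.4, 0.4]]
    print(sequential_value(P, rounds=2))
```

Memoizing on the pair of bitmasks keeps the number of evaluated states bounded by the number of subsets of used players, so the sketch is only practical for small teams. Recording the maximizing `i` at each state would recover the first team's selection rule, that is, which robot to commit in each round.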