Kaminka Gal A
Department of Computer Science & Gonda Brain Science Center & BINA Nano-Technology Center, Bar Ilan University, Israel.
Philos Trans A Math Phys Eng Sci. 2025 Jan 30;383(2289):20240136. doi: 10.1098/rsta.2024.0136.
The emergence of collective order in swarms from local, myopic interactions of their individual members is of interest to biology, sociology, psychology, computer science, robotics, physics and economics. Swarms whose members unknowingly work towards a common goal are particularly perplexing: members sometimes take individual actions that maximize collective utility, at the expense of their own. This seems to contradict expectations of individual rationality. Moreover, members choose these actions without knowing their effect on the collective utility. I examine this puzzle through game theory, machine learning and robots. I show that in some settings, the collective utility can be transformed into a reward that can be measured locally: when interacting, members individually choose actions that receive a reward based on how quickly the interaction was resolved, how much individual work time is gained and the approximate effect on others. This internally measurable reward is individually and independently maximized by learning. This results in an equilibrium, where the learned response of each individual maximizes both its individual reward and the collective utility, i.e. both the swarm and the individuals are rational. This article is part of the theme issue 'The road forward with swarm systems'.
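As a rough illustration of the idea of learning from a locally measurable reward, the sketch below has a single swarm member choose among three responses to an interaction and update its estimates from a reward built from the three quantities named in the abstract. Everything specific here is an assumption made for illustration and is not taken from the article: the action names, the numeric outcomes, the simple linear form of `local_reward`, and the use of epsilon-greedy value learning in place of whatever learning rule the author uses.

```python
import random

# Hypothetical responses a member can choose when it interacts with another.
ACTIONS = ["wait", "back_off", "push_through"]

# Hypothetical mean outcomes per action:
# (time to resolve the interaction, individual work time gained, delay imposed on others)
OUTCOMES = {
    "wait": (4.0, 6.0, 1.0),
    "back_off": (1.0, 8.0, 0.5),
    "push_through": (0.5, 9.0, 6.0),
}


def local_reward(resolve_time, work_time_gained, delay_to_others):
    """Assumed reward: work time gained, minus time lost resolving the
    interaction, minus the approximate cost imposed on others.
    All three terms are quantities the member can measure itself."""
    return work_time_gained - resolve_time - delay_to_others


class Member:
    """One swarm member learning a value estimate per action."""

    def __init__(self, rng, epsilon=0.1, alpha=0.2):
        self.rng = rng
        self.epsilon = epsilon  # exploration probability
        self.alpha = alpha      # learning rate
        self.q = {a: 0.0 for a in ACTIONS}

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(self.q, key=self.q.get)

    def update(self, action, reward):
        # Move the estimate for this action towards the observed reward.
        self.q[action] += self.alpha * (reward - self.q[action])


def train(steps=2000, seed=0):
    """Run repeated interactions and return the trained member."""
    rng = random.Random(seed)
    member = Member(rng)
    for _ in range(steps):
        action = member.choose()
        resolve, gained, delay = OUTCOMES[action]
        resolve += rng.uniform(-0.5, 0.5)  # noisy local measurement
        member.update(action, local_reward(resolve, gained, delay))
    return member


if __name__ == "__main__":
    trained = train()
    print(max(trained.q, key=trained.q.get), trained.q)
```

In this toy setting the member never observes any swarm-level quantity. It only sees its own measurements, yet the action it settles on is the one with a small cost to others, because that cost is folded into the reward it maximizes.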