Department of Statistics, University of California Los Angeles, Los Angeles, California, USA.
Department of Psychology, University of California Los Angeles, Los Angeles, California, USA.
J Comput Biol. 2022 Sep;29(9):1022-1030. doi: 10.1089/cmb.2021.0549. Epub 2022 Jun 24.
Coordinated hunting is widely observed in animals, and sharing rewards is often considered a major incentive for its success. While current theories about the role played by sharing in coordinated hunting are based on correlational evidence, we reveal the causal roles of sharing rewards through computational modeling with a state-of-the-art Multi-agent Reinforcement Learning (MARL) algorithm. We show that, counterintuitively, selfish agents reach robust coordination, whereas sharing rewards undermines it. Hunting coordination modeled through sharing rewards (1) suffers from the free-rider problem, (2) plateaus at a small group size, and (3) is not a Nash equilibrium. Moreover, individually rewarded predators outperform predators that share rewards, especially when the hunting is difficult, the group size is large, and the action cost is high. Our results shed new light on the actual importance of prosocial motives for successful coordination in nonhuman animals and suggest that sharing rewards might simply be a byproduct of hunting, rather than a design strategy aimed at facilitating group coordination. They also highlight that current artificial intelligence models of human-like coordination in group settings that assume reward sharing as a motivator (e.g., MARL) might not adequately capture what is truly necessary for successful coordination.
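The free-rider and Nash-equilibrium claims can be illustrated with a deliberately tiny one-shot game. This is only a sketch and not the MARL model used in the study: the capture-probability form, the equal split of a shared reward, the rule that an individual reward goes to one randomly chosen hunter, and every name and parameter value below are hypothetical assumptions chosen for illustration.

```python
def capture_prob(k, q=0.5):
    """Chance that at least one of k hunters succeeds.

    Hypothetical form: each hunter succeeds independently with probability q.
    """
    return 1.0 - (1.0 - q) ** k


def payoff(hunts, others_hunting, n, scheme, reward=1.0, cost=0.1):
    """Expected payoff of one predator in a group of n (toy assumptions).

    "shared":     the reward is split equally among all n predators,
                  whether or not they took part in the hunt.
    "individual": only hunters can be rewarded; one of the k hunters
                  is assumed to make the catch, each with equal chance.
    """
    k = others_hunting + (1 if hunts else 0)  # total number of hunters
    p = capture_prob(k)
    if scheme == "shared":
        gain = p * reward / n
    else:
        gain = (p * reward / k) if hunts else 0.0
    return gain - (cost if hunts else 0.0)


def all_hunt_is_nash(n, scheme, **kwargs):
    """True if no single predator gains by idling while the other n-1 hunt."""
    stay = payoff(True, n - 1, n, scheme, **kwargs)
    deviate = payoff(False, n - 1, n, scheme, **kwargs)
    return stay >= deviate


if __name__ == "__main__":
    for scheme in ("individual", "shared"):
        print(scheme, all_hunt_is_nash(3, scheme))
```

With these assumed values (three predators, q = 0.5, cost = 0.1), an idle predator under the shared scheme still collects its share of the reward while saving the action cost, so "everyone hunts" is not stable; under the individual scheme idling earns nothing, so no predator benefits from dropping out. Different parameter choices can change the outcome, which is why this is a conceptual illustration rather than a reproduction of the reported results.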