Zamora J, Millán J R, Murciano A
Departamento de Biomatemática, Universidad Complutense de Madrid, Spain.
Biol Cybern. 1998 Mar;78(3):197-205. doi: 10.1007/s004220050426.
Optimization of performance in collective systems often requires altruism. Altruistic behaviors are difficult to establish and stabilize because agents incur a cost when behaving altruistically. In this paper, we propose a biologically inspired strategy, namely reciprocal altruism, for learning stable altruistic behaviors in artificial multi-agent systems. Combined with learning capabilities, this strategy makes altruistic agents cooperate only with one another whenever future benefits exceed the current cost of altruistic acts, thus preventing their exploitation by selfish agents. Our multi-agent system is made up of agents with a behavior-based architecture. Agents learn the most suitable cooperative strategy for different environments by means of a reinforcement learning algorithm, in which each agent receives a reinforcement signal that measures only its individual performance. Simulation results show how the multi-agent system learns stable altruistic behaviors, achieving optimal (or near-optimal) performance in unknown and changing environments.
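The abstract does not specify the paper's behavior-based architecture or its reinforcement learning algorithm, but the core condition it states — altruism is worthwhile only when discounted future benefits exceed the immediate cost — can be illustrated with a minimal sketch. The model below is an assumption for illustration, not the authors' method: an iterated donation game (donating costs `cost` to the donor and yields `benefit` to the recipient) against a reciprocating, tit-for-tat partner, solved exactly by value iteration over a two-state model (the state is the agent's own previous action, which the partner mirrors).

```python
def solve_q(benefit=3.0, cost=1.0, gamma=0.9, iters=1000):
    """Exact Q-values for an agent facing a tit-for-tat partner.

    Illustrative sketch only (not the paper's algorithm). States: the
    agent's previous action (0 = kept the benefit, 1 = donated), which a
    tit-for-tat partner repeats this round. Actions: 0 = keep, 1 = donate.
    gamma discounts future benefits against the current cost of donating.
    """
    q = {s: {a: 0.0 for a in (0, 1)} for s in (0, 1)}
    for _ in range(iters):
        new = {0: {}, 1: {}}
        for s in (0, 1):
            for a in (0, 1):
                # Partner mirrors our last move (s); donating now costs us,
                # but sets the state the partner will mirror next round.
                reward = benefit * s - cost * a
                next_state = a
                new[s][a] = reward + gamma * max(q[next_state].values())
        q = new
    return q
```

With a long horizon (`gamma=0.9`), donating is the greedy action in every state, so reciprocal altruists cooperate stably with one another; with a short horizon (`gamma=0.1`), the discounted future benefit no longer covers the immediate cost and the greedy policy defects, mirroring the condition stated in the abstract.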