Universidad de Buenos Aires, Facultad de Ingeniería, Instituto de Ingeniería Biomédica, Buenos Aires, Argentina.
Instituto de Biología y Medicina Experimental (IBYME-CONICET), Laboratorio de Biología del Comportamiento, Ciudad de Buenos Aires, Buenos Aires, Argentina.
PLoS One. 2019 Jan 2;14(1):e0204837. doi: 10.1371/journal.pone.0204837. eCollection 2019.
Cooperation is one of the most studied paradigms for understanding social interactions. Reciprocal altruism, a special type of cooperation that is taught by means of the iterated prisoner's dilemma game (iPD), has been shown to emerge in different species with different success rates. When playing iPD against a reciprocal opponent, the largest theoretical long-term reward is obtained when both players cooperate. In this work, we trained rats in iPD against an opponent playing a Tit for Tat strategy, using a payoff matrix with positive and negative reinforcements, that is, food and timeout respectively. We showed for the first time that experimental rats were able to learn reciprocal altruism with a high average cooperation rate, where the most probable state was mutual cooperation (85%). When subjects defected, the most probable behavior was to go back to mutual cooperation. When we modified the matrix by increasing temptation rewards (T) or by increasing cooperation rewards (R), the cooperation rate decreased. In conclusion, we observed that an iPD matrix with large positive rewards elicits less cooperation than one with small rewards, showing that satisfying the relationship among iPD reinforcements was not enough to achieve high mutual cooperation. Therefore, given positive and negative reinforcements and an appropriate contrast between rewards, rats have the cognitive capacity to learn reciprocal altruism. This finding suggests that the ability to learn reciprocal altruism appeared early in evolution.
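The claim that mutual cooperation yields the largest long-term reward against a Tit for Tat opponent can be illustrated with a minimal sketch. The payoff values below are hypothetical textbook numbers, not the food and timeout reinforcements used in the study, and the function names are illustrative only. The sketch assumes the standard iPD conditions, T > R > P > S and 2R > T + S, where P (punishment for mutual defection) and S (sucker's payoff) are the conventional labels for the two remaining matrix entries.

```python
# Hypothetical payoff values (assumption, not taken from the paper):
# T = temptation, R = mutual cooperation reward,
# P = mutual defection punishment, S = sucker's payoff.
PAYOFFS = {"T": 5, "R": 3, "P": 1, "S": 0}


def is_valid_ipd(T, R, P, S):
    """Check the standard iPD conditions: T > R > P > S and 2R > T + S."""
    return T > R > P > S and 2 * R > T + S


def payoff(my_move, other_move, T, R, P, S):
    """Payoff to the focal player for one round; moves are 'C' or 'D'."""
    table = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    return table[(my_move, other_move)]


def play_vs_tit_for_tat(subject_moves, T, R, P, S):
    """Total payoff of a fixed move sequence played against Tit for Tat.

    Tit for Tat cooperates on the first round and afterwards copies
    the subject's previous move.
    """
    total = 0
    opponent_move = "C"
    for move in subject_moves:
        total += payoff(move, opponent_move, T, R, P, S)
        opponent_move = move  # Tit for Tat mirrors this move next round
    return total


rounds = 10
always_cooperate = play_vs_tit_for_tat("C" * rounds, **PAYOFFS)
always_defect = play_vs_tit_for_tat("D" * rounds, **PAYOFFS)
alternate = play_vs_tit_for_tat("DC" * (rounds // 2), **PAYOFFS)

print(is_valid_ipd(**PAYOFFS))
print(always_cooperate, alternate, always_defect)
```

Under these assumed values, sustained cooperation earns more over ten rounds than either constant defection or alternating between defection and cooperation, which is what the condition 2R > T + S guarantees. The sketch does not model timeouts or learning, so it says nothing about why larger rewards reduced cooperation in the experiment.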