Duéñez-Guzmán Edgar A, Comanescu Ramona, Mao Yiran, McKee Kevin R, Coppin Ben, Sadedin Suzanne, Chiappa Silvia, Vezhnevets Alexander S, Bakker Michiel A, Bachrach Yoram, Isaac William, Tuyls Karl, Leibo Joel Z
Google DeepMind, Google UK Ltd., London EC4A 3TW, United Kingdom.
Independent researcher, London N1C 4DN, United Kingdom.
Proc Natl Acad Sci U S A. 2025 Jun 24;122(25):e2319933121. doi: 10.1073/pnas.2319933121. Epub 2025 Jun 16.
Choosing social partners is a potentially demanding task that involves paying attention to the right information while disregarding salient but possibly irrelevant features. The resultant trade-off between cost of evaluation and quality of decisions can lead to undesired bias. Information-processing abilities mediate this trade-off: individuals with higher ability choose better partners, leading to higher performance. By altering the salience of features, technology can modulate the effect of information-processing limits, potentially increasing or decreasing undesired biases. Here, we use game theory and multiagent reinforcement learning to investigate how undesired biases emerge, and how a technological layer (in the form of a perceptual intervention) between individuals and their environment can ameliorate such biases. Our results show that a perceptual intervention designed to increase the salience of outcome-relevant features can reduce bias in agents making partner choice decisions. Individuals learning with a perceptual intervention showed less bias due to decreased reliance on features that only spuriously correlate with behavior. Mechanistically, the perceptual intervention effectively increased the information-processing abilities of the individuals. Our results highlight the benefit of using multiagent reinforcement learning to model theoretically grounded social behaviors, particularly when real-world complexity prohibits fully analytical approaches.
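The mechanism described above (limited perception of an outcome-relevant feature leading to reliance on a salient but spurious one) can be illustrated with a small toy calculation. This is only a sketch under stated assumptions and is not the authors' model: it swaps multiagent reinforcement learning for a plain Bayes-rule estimate, and the names `posterior_cooperative`, `marker_reliance`, `q` (how accurately the relevant cue is perceived) and `rho` (how often a salient marker happens to match behavior) are hypothetical.

```python
def posterior_cooperative(seen_cue: int, marker: int, q: float, rho: float) -> float:
    """Estimate P(partner cooperates | noisy relevant cue, salient marker).

    Toy assumptions (not from the paper): a 50/50 prior over cooperation,
    the relevant cue is perceived correctly with probability q, and the
    marker matches true behavior with probability rho (0 < rho < 1).
    """
    # Likelihood of each observation if the partner cooperates or defects.
    cue_if_coop = q if seen_cue == 1 else 1.0 - q
    cue_if_defect = 1.0 - q if seen_cue == 1 else q
    marker_if_coop = rho if marker == 1 else 1.0 - rho
    marker_if_defect = 1.0 - rho if marker == 1 else rho

    # Bayes rule; the equal priors cancel out.
    coop = cue_if_coop * marker_if_coop
    defect = cue_if_defect * marker_if_defect
    return coop / (coop + defect)


def marker_reliance(q: float, rho: float) -> float:
    """Average shift in the estimate caused by the marker alone."""
    gaps = [
        posterior_cooperative(cue, 1, q, rho) - posterior_cooperative(cue, 0, q, rho)
        for cue in (0, 1)
    ]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Raising q stands in for a perceptual intervention that makes the
    # outcome-relevant cue easier to perceive.
    for q in (0.5, 0.7, 0.9, 1.0):
        print(f"q={q:.1f}  reliance on marker={marker_reliance(q, rho=0.8):.3f}")
```

In this toy setting, the marker's influence on the estimate shrinks as the relevant cue becomes easier to perceive, which mirrors in a very simplified form the abstract's claim that the intervention reduces reliance on spuriously correlated features.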