Maeda Shion, Chauvet Nicolas, Saigo Hayato, Hori Hirokazu, Bachelier Guillaume, Huant Serge, Naruse Makoto
Department of Mathematical Engineering and Information Physics, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan.
Department of Information Physics and Computing, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan.
Sci Rep. 2021 Mar 1;11(1):4832. doi: 10.1038/s41598-021-84199-5.
Collective decision making is important for maximizing total benefits while preserving equality among individuals in the competitive multi-armed bandit (CMAB) problem, in which multiple players compete for higher rewards from multiple slot machines. The CMAB problem captures an essential aspect of applications such as resource management in social infrastructure. In a previous study, we demonstrated theoretically and experimentally that entangled photons can physically resolve the difficulty of the CMAB problem: the resulting decision-making strategy completely avoids decision conflicts while ensuring equality. However, decision conflicts can be beneficial when they yield greater rewards than non-conflicting decisions, which indicates that greedy actions may have positive effects depending on the environment. In this study, we demonstrate a mixed strategy of entangled- and correlated-photon-based decision making that increases total rewards compared with the entangled-photon-only strategy. We show that an optimal mixture of the entangled- and correlated-photon-based strategies exists and that it depends on the dynamics of the reward environment and on the difficulty of the given problem. This study paves the way for using both the quantum and classical aspects of photons in a mixed manner for decision making, and provides another example of the superiority of mixed strategies known in game theory, especially evolutionary game theory.
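To make concrete the condition under which a decision conflict can pay off, the following is a minimal sketch of a two-player, two-machine toy model. It is not the model used in the paper. Everything in it is an assumption introduced for illustration: the reward probabilities `p_a` and `p_b`, the multiplier `conflict_factor` applied to each player's payout when both pick the same machine, and `q_best`, the probability that a conflicting round lands on the better machine.

```python
def expected_total_reward(mix, p_a, p_b, conflict_factor, q_best=1.0):
    """Expected total reward per round in a hypothetical two-player, two-machine model.

    mix             -- fraction of rounds decided without conflict (players pick
                       different machines, as an entangled-photon rule would enforce)
    p_a, p_b        -- reward probabilities of machines A and B (assumed values)
    conflict_factor -- assumed multiplier on each player's payout when both
                       players pick the same machine
    q_best          -- assumed probability that a conflicting round lands on
                       the better machine
    """
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must lie in [0, 1]")
    p_best, p_worst = max(p_a, p_b), min(p_a, p_b)

    # Conflict-free round: one player on each machine.
    no_conflict = p_a + p_b

    # Conflicting round: both players on the same machine, each payout scaled.
    conflict = 2.0 * conflict_factor * (q_best * p_best + (1.0 - q_best) * p_worst)

    return mix * no_conflict + (1.0 - mix) * conflict


def conflict_is_beneficial(p_a, p_b, conflict_factor, q_best=1.0):
    """True if always conflicting beats never conflicting in this toy model."""
    always_conflict = expected_total_reward(0.0, p_a, p_b, conflict_factor, q_best)
    never_conflict = expected_total_reward(1.0, p_a, p_b, conflict_factor, q_best)
    return always_conflict > never_conflict


if __name__ == "__main__":
    # Sweep the mixing ratio for one assumed environment.
    for mix in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(mix, expected_total_reward(mix, p_a=0.9, p_b=0.1, conflict_factor=0.8))
```

In this static sketch the expected reward is linear in `mix`, so the best value always sits at an endpoint. It therefore shows only when greedy, conflicting choices can beat conflict-free ones. The interior optimum reported in the abstract depends on the dynamics of the reward environment and on problem difficulty, neither of which is modeled here.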