Using a Stochastic Agent Model to Optimize Performance in Divergent Interest Tacit Coordination Games.

Affiliations

Department of Industrial Engineering and Management, Ariel University, Ariel 40700, Israel.

Data Science and Artificial Intelligence Research Center, Ariel University, Ariel 40700, Israel.

Publication information

Sensors (Basel). 2020 Dec 8;20(24):7026. doi: 10.3390/s20247026.

Abstract

In recent years, collaborative robots have become major market drivers in Industry 5.0, which aims to incorporate them alongside humans in a wide array of settings ranging from welding to rehabilitation. Improving human-machine collaboration entails using computational algorithms that save processing as well as communication costs. In this study, we constructed an agent that can choose when to cooperate using an optimal strategy. The agent was designed to operate in divergent interest tacit coordination games, in which communication between the players is not possible and the payoff is not symmetric. The agent's model was based on a behavioral model that predicts the probability of a player converging on prominent solutions with salient features (e.g., focal points) based on the player's Social Value Orientation (SVO) and the specific game features. SVO theory pertains to the preferences of decision makers when allocating joint resources between themselves and another player in the context of behavioral game theory. The agent selected stochastically between two possible policies, a greedy policy or a cooperative policy, based on the probability that a player converges on a focal point. The distribution of points obtained by the autonomous agent incorporating the SVO in its model was better than that obtained by human players who played against each other (i.e., the distribution associated with the agent had a higher mean value). Moreover, the distribution of points gained by the agent was better than that of either separate strategy the agent could choose from, namely, always choosing a greedy solution or always choosing a focal point solution. To the best of our knowledge, this is the first attempt to construct an intelligent agent that maximizes its utility by incorporating the belief system of the player in the context of tacit bargaining. This reward-maximizing strategy selection process based on the SVO can potentially also be applied in other human-machine contexts, including multiagent systems.
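
The stochastic selection between a greedy and a cooperative policy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the form of the behavioral model, so the logistic function, its coefficients, and the names `focal_point_probability`, `choose_policy`, `svo_angle_deg`, and `payoff_gap` are all hypothetical placeholders. The sketch also assumes that the agent picks the cooperative policy with a probability equal to the predicted probability that the other player converges on the focal point.

```python
import math
import random


def focal_point_probability(svo_angle_deg, payoff_gap,
                            w_svo=0.05, w_gap=-0.3, bias=0.0):
    """Hypothetical stand-in for the behavioral model.

    Maps a player's SVO angle (degrees) and a game feature (here, an
    assumed payoff gap between the players) to a probability of
    converging on the focal point. The logistic form and the default
    coefficients are illustrative assumptions, not fitted values.
    """
    z = bias + w_svo * svo_angle_deg + w_gap * payoff_gap
    return 1.0 / (1.0 + math.exp(-z))


def choose_policy(p_focal, rng):
    """Pick 'cooperative' with probability p_focal, otherwise 'greedy'."""
    return "cooperative" if rng.random() < p_focal else "greedy"


# Seeded generator so that repeated runs give the same draws.
rng = random.Random(0)

# Illustrative inputs only: a fairly prosocial player, moderate payoff gap.
p = focal_point_probability(svo_angle_deg=30.0, payoff_gap=2.0)

# Over many games, the share of cooperative choices should approach p.
choices = [choose_policy(p, rng) for _ in range(10_000)]
share = choices.count("cooperative") / len(choices)
print(f"p_focal = {p:.3f}, cooperative share = {share:.3f}")
```

Under these assumptions, a more prosocial opponent (larger SVO angle) raises the predicted focal-point probability and so makes the cooperative policy more likely, while the agent still plays greedily some of the time.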

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff51/7763831/d8082f7b23ea/sensors-20-07026-g0A1.jpg