Huang Chien-Ming, Andrist Sean, Sauppé Allison, Mutlu Bilge
Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, USA.
Front Psychol. 2015 Jul 24;6:1049. doi: 10.3389/fpsyg.2015.01049. eCollection 2015.
In everyday interactions, humans naturally exhibit behavioral cues, such as gaze and head movements, that signal their intentions while interpreting the behavioral cues of others to predict their intentions. Such intention prediction enables each partner to adapt their behaviors to the intent of others, serving a critical role in joint action where parties work together to achieve a common goal. Among behavioral cues, eye gaze is particularly important in understanding a person's attention and intention. In this work, we seek to quantify how gaze patterns may indicate a person's intention. Our investigation was contextualized in a dyadic sandwich-making scenario in which a "worker" prepared a sandwich by adding ingredients requested by a "customer." In this context, we investigated the extent to which the customers' gaze cues serve as predictors of which ingredients they intend to request. Predictive features were derived to represent characteristics of the customers' gaze patterns. We developed a support vector machine-based (SVM-based) model that achieved 76% accuracy in predicting the customers' intended requests based solely on gaze features. Moreover, the predictor made correct predictions approximately 1.8 s before the spoken request from the customer. We further analyzed several episodes of interactions from our data to develop a deeper understanding of the scenarios where our predictor succeeded and failed in making correct predictions. These analyses revealed additional gaze patterns that may be leveraged to improve intention prediction. This work highlights gaze cues as a significant resource for understanding human intentions and informs the design of real-time recognizers of user intention for intelligent systems, such as assistive robots and ubiquitous devices, that may enable more complex capabilities and improved user experience.
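To make the idea of predicting a request from gaze features concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical fixation format of (item, start time in seconds, duration in seconds) and four illustrative per-item features; the abstract does not list the actual features. A trained linear-kernel SVM reduces to a weighted sum of features plus a bias, so the sketch scores each ingredient with that form. The weights here are invented for illustration and were not learned from any data.

```python
from collections import defaultdict

# Hypothetical fixation format: (item_name, start_time_s, duration_s).
# The feature names below are illustrative assumptions, not taken from the paper.

def gaze_features(fixations):
    """Summarize a fixation sequence into per-item gaze features."""
    feats = defaultdict(lambda: {"glances": 0, "total_dur": 0.0,
                                 "first_dur": 0.0, "most_recent": 0.0})
    ordered = sorted(fixations, key=lambda f: f[1])  # sort by start time
    for item, _start, dur in ordered:
        f = feats[item]
        if f["glances"] == 0:
            f["first_dur"] = dur      # duration of the first glance at this item
        f["glances"] += 1             # number of glances at this item
        f["total_dur"] += dur         # total time spent looking at this item
    if ordered:
        feats[ordered[-1][0]]["most_recent"] = 1.0  # flag the last item looked at
    return dict(feats)

# Made-up weights and bias standing in for a trained linear decision function.
WEIGHTS = {"glances": 0.5, "total_dur": 1.0, "first_dur": 0.2, "most_recent": 0.8}
BIAS = 0.0

def score(features):
    """Linear decision value w . x + b for one item's feature vector."""
    return sum(WEIGHTS[name] * features[name] for name in WEIGHTS) + BIAS

def predict_item(fixations):
    """Return the item with the highest decision value, or None if no gaze data."""
    feats = gaze_features(fixations)
    if not feats:
        return None
    return max(feats, key=lambda item: score(feats[item]))

if __name__ == "__main__":
    fixations = [
        ("lettuce", 0.0, 0.3),
        ("tomato", 0.4, 0.6),
        ("lettuce", 1.1, 0.2),
        ("tomato", 1.4, 0.9),
    ]
    print(predict_item(fixations))
```

In a real system the weights would come from training an SVM on labeled episodes, and the prediction would be recomputed as new fixations arrive, so that the moment of the first correct prediction can be compared with the onset of the spoken request.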