Art critic: Multisignal vision and speech interaction system in a gaming context.

Publication information

IEEE Trans Cybern. 2013 Dec;43(6):1546-59. doi: 10.1109/TCYB.2013.2271606.

Abstract

True immersion of a player within a game can only occur when the simulated world looks and behaves as close to reality as possible. This implies that the game must correctly read and understand, among other things, the player's focus, attitude toward the objects/persons in focus, gestures, and speech. In this paper, we propose a novel system that integrates eye gaze estimation, head pose estimation, facial expression recognition, speech recognition, and text-to-speech components for use in real-time games. Both the eye gaze and head pose components utilize underlying 3-D models, and our novel head pose estimation algorithm uniquely combines scene flow with a generic head model. The facial expression recognition module applies the local binary patterns with three orthogonal planes approach to the 2-D shape index domain rather than the pixel domain, resulting in improved classification. Our system has also been extended to use a pan-tilt-zoom camera driven by the Kinect, allowing us to track a moving player. A test game, Art Critic, is also presented, which not only demonstrates the utility of our system but also provides a template for player/non-player character (NPC) interaction in a gaming context. The player alters his/her view of the 3-D world using head pose, looks at paintings/NPCs using eye gaze, and evaluates them through expression and speech. The NPC artist responds with facial expression and synthetic speech based on its personality. Both qualitative and quantitative evaluations are performed to illustrate the system's effectiveness.
