Speech Driven Gaze in a Face-to-Face Interaction.

Authors

Arslan Aydin Ülkü, Kalkan Sinan, Acartürk Cengiz

Affiliations

Cognitive Science Department, Middle East Technical University, Ankara, Turkey.

Computer Engineering Department, Middle East Technical University, Ankara, Turkey.

Publication information

Front Neurorobot. 2021 Mar 4;15:598895. doi: 10.3389/fnbot.2021.598895. eCollection 2021.

Abstract

Gaze and language are major pillars of multimodal communication. Gaze is a non-verbal mechanism that conveys crucial social signals in face-to-face conversation. However, compared to language, gaze has been less studied as a communication modality. The purpose of the present study is twofold: (i) to investigate gaze direction (i.e., aversion and face gaze) and its relation to speech in a face-to-face interaction; and (ii) to propose a computational model for multimodal communication that predicts gaze direction using high-level speech features. Twenty-eight pairs of participants took part in data collection. The experimental setting was a mock job interview. Eye movements were recorded for both participants. The speech data were annotated according to the ISO 24617-2 Standard for Dialogue Act Annotation, as well as with manual tags based on previous social gaze studies. A comparative analysis was conducted using Convolutional Neural Network (CNN) models with specific architectures, namely VGGNet and ResNet. The results showed that the frequency and duration of gaze differ significantly depending on the role of the participant. Moreover, the ResNet models achieved higher than 70% accuracy in predicting gaze direction.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eed5/7970197/774eba262db9/fnbot-15-598895-g0001.jpg
