Quantitative Analysis of Multimodal Speech Data.

Authors

Samantha Gordon Danner, Adriano Vilela Barbosa, Louis Goldstein

Affiliations

University of Southern California, Los Angeles, CA, 90089, USA.

Federal University of Minas Gerais, Belo Horizonte, MG, CEP 31270-901, Brazil.

Publication information

J Phon. 2018 Nov;71:268-283. doi: 10.1016/j.wocn.2018.09.007. Epub 2018 Oct 19.

Abstract

This study presents techniques for quantitatively analyzing coordination and kinematics in multimodal speech using video, audio and electromagnetic articulography (EMA) data. Multimodal speech research has flourished due to recent improvements in technology, yet gesture detection/annotation strategies vary widely, leading to difficulty in generalizing across studies and in advancing this field of research. We describe how FlowAnalyzer software can be used to extract kinematic signals from basic video recordings, and we apply a technique, derived from speech kinematic research, to detect bodily gestures in these kinematic signals. We investigate whether kinematic characteristics of multimodal speech differ depending on communicative context, and we find that these contexts can be distinguished quantitatively, suggesting a way to improve and standardize existing gesture identification/annotation strategies. We also discuss a method, Correlation Map Analysis (CMA), for quantifying the relationship between speech and bodily gesture kinematics over time. We describe potential applications of CMA to multimodal speech research, such as describing characteristics of speech-gesture coordination in different communicative contexts. The use of the techniques presented here can improve and advance multimodal speech and gesture research by applying quantitative methods in the detection and description of multimodal speech.
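
The idea of quantifying how two kinematic signals relate over time can be illustrated with a small sketch. The code below is not the CMA algorithm described in the article; it is a simplified stand-in that computes a Pearson correlation in a sliding window at several time lags. The function names, the window length, and the synthetic signals are all assumptions made for illustration, and the published method may differ in how it windows and filters the signals.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def windowed_lagged_correlation(x, y, window, max_lag):
    """Return {lag: [r for each window start]} for lags in -max_lag..max_lag.

    At lag k, x[t] is compared with y[t + k], so a positive lag means
    that y follows x by k samples. (Hypothetical helper, not from the paper.)
    """
    n = min(len(x), len(y))
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        # Keep only window starts for which both slices stay inside the signals.
        first = max(0, -lag)
        last = min(n, n - lag) - window
        rs = []
        for start in range(first, last + 1):
            xs = x[start:start + window]
            ys = y[start + lag:start + lag + window]
            rs.append(pearson(xs, ys))
        result[lag] = rs
    return result


# Synthetic example: y is x delayed by 3 samples (padded with zeros at the start).
x = [math.sin(0.3 * t) for t in range(60)]
y = [0.0] * 3 + x[:-3]

corr_map = windowed_lagged_correlation(x, y, window=20, max_lag=5)
mean_r = {lag: sum(rs) / len(rs) for lag, rs in corr_map.items()}
best_lag = max(mean_r, key=mean_r.get)
print("lag with highest mean correlation:", best_lag)
```

Each row of `corr_map` is a correlation time series at one lag, so plotting the rows as a lag-by-time image would show when, and at what delay, the two signals move together. In practice `x` and `y` would be, for example, a speech articulator signal and a body-motion signal sampled at the same rate.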
