Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors.

Authors

Wang Yansen, Shen Ying, Liu Zhun, Liang Paul Pu, Zadeh Amir, Morency Louis-Philippe

Affiliations

Department of Computer Science, Tsinghua University.

School of Computer Science, Carnegie Mellon University.

Publication information

Proc AAAI Conf Artif Intell. 2019 Jul;33(1):7216-7223.

Abstract

Humans convey their intentions through the usage of both verbal and nonverbal behaviors during face-to-face communication. Speaker intentions often vary dynamically depending on different nonverbal contexts, such as vocal patterns and facial expressions. As a result, when modeling human language, it is essential to not only consider the literal meaning of the words but also the nonverbal contexts in which these words appear. To better model human language, we first model expressive nonverbal representations by analyzing the fine-grained visual and acoustic patterns that occur during word segments. In addition, we seek to capture the dynamic nature of nonverbal intents by shifting word representations based on the accompanying nonverbal behaviors. To this end, we propose the Recurrent Attended Variation Embedding Network (RAVEN) that models the fine-grained structure of nonverbal subword sequences and dynamically shifts word representations based on nonverbal cues. Our proposed model achieves competitive performance on two publicly available datasets for multimodal sentiment analysis and emotion recognition. We also visualize the shifted word representations in different nonverbal contexts and summarize common patterns regarding multimodal variations of word representations.
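The abstract describes shifting a word representation according to the visual and acoustic behavior that accompanies the word, but it does not give the equations. The following is only a minimal sketch of that idea under stated assumptions: each word has a short sequence of visual and acoustic feature frames, the sequences are summarized by mean pooling (a simple stand-in for the recurrent modeling the abstract mentions), a sigmoid gate per modality decides how much each modality contributes, and the resulting shift is rescaled so it cannot exceed a fraction `beta` of the word vector's norm. The function names, parameter names, dimensions, and the random weights are all hypothetical and are not taken from the source.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def make_params(d_word, d_v, d_a, seed=0):
    """Random, untrained weights for illustration only (hypothetical names)."""
    rng = np.random.default_rng(seed)
    return {
        "Wg_v": rng.normal(scale=0.1, size=(d_word, d_v + d_word)),  # visual gate
        "bg_v": np.zeros(d_word),
        "Wg_a": rng.normal(scale=0.1, size=(d_word, d_a + d_word)),  # acoustic gate
        "bg_a": np.zeros(d_word),
        "W_v": rng.normal(scale=0.1, size=(d_word, d_v)),  # visual projection
        "W_a": rng.normal(scale=0.1, size=(d_word, d_a)),  # acoustic projection
        "b": np.zeros(d_word),
    }


def shift_word_embedding(word, visual_seq, acoustic_seq, params, beta=1.0):
    """Return the word vector displaced by a gated nonverbal shift.

    word:         (d_word,) word embedding
    visual_seq:   (T_v, d_v) visual frames spanning the word
    acoustic_seq: (T_a, d_a) acoustic frames spanning the word
    """
    # Assumption: mean pooling replaces a learned recurrent encoder.
    h_v = visual_seq.mean(axis=0)
    h_a = acoustic_seq.mean(axis=0)

    # Per-modality gates, conditioned on the nonverbal summary and the word.
    g_v = sigmoid(params["Wg_v"] @ np.concatenate([h_v, word]) + params["bg_v"])
    g_a = sigmoid(params["Wg_a"] @ np.concatenate([h_a, word]) + params["bg_a"])

    # Gated sum of the projected nonverbal summaries gives the shift vector.
    shift = g_v * (params["W_v"] @ h_v) + g_a * (params["W_a"] @ h_a) + params["b"]

    # Cap the shift so its norm is at most beta times the word vector's norm.
    shift_norm = np.linalg.norm(shift)
    if shift_norm == 0.0:
        return word.copy()
    alpha = min(beta * np.linalg.norm(word) / shift_norm, 1.0)
    return word + alpha * shift


if __name__ == "__main__":
    params = make_params(d_word=8, d_v=5, d_a=3)
    rng = np.random.default_rng(1)
    word = rng.normal(size=8)
    shifted = shift_word_embedding(
        word, rng.normal(size=(4, 5)), rng.normal(size=(6, 3)), params, beta=0.5
    )
    print("relative shift:", np.linalg.norm(shifted - word) / np.linalg.norm(word))
```

In this sketch the same word receives a different vector whenever its accompanying frames differ, which is the behavior the abstract refers to as shifted word representations. In a trained model the weights would be learned jointly with the downstream sentiment or emotion classifier rather than sampled at random.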

Similar articles

Integrating Multimodal Information in Large Pretrained Transformers.
Proc Conf Assoc Comput Linguist Meet. 2020 Jul;2020:2359-2369. doi: 10.18653/v1/2020.acl-main.214.

Cited by

Research on Multimodal Fusion of Temporal Electronic Medical Records.
Bioengineering (Basel). 2024 Jan 18;11(1):94. doi: 10.3390/bioengineering11010094.

Deep learning on multi-view sequential data: a survey.
Artif Intell Rev. 2023;56(7):6661-6704. doi: 10.1007/s10462-022-10332-z. Epub 2022 Nov 29.
