

Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors

Authors

Wang Yansen, Shen Ying, Liu Zhun, Liang Paul Pu, Zadeh Amir, Morency Louis-Philippe

Affiliations

Department of Computer Science, Tsinghua University.

School of Computer Science, Carnegie Mellon University.

Publication

Proc AAAI Conf Artif Intell. 2019 Jul;33(1):7216-7223.

PMID: 32219010
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7098710/
Abstract

Humans convey their intentions through the usage of both verbal and nonverbal behaviors during face-to-face communication. Speaker intentions often vary dynamically depending on different nonverbal contexts, such as vocal patterns and facial expressions. As a result, when modeling human language, it is essential to not only consider the literal meaning of the words but also the nonverbal contexts in which these words appear. To better model human language, we first model expressive nonverbal representations by analyzing the fine-grained visual and acoustic patterns that occur during word segments. In addition, we seek to capture the dynamic nature of nonverbal intents by shifting word representations based on the accompanying nonverbal behaviors. To this end, we propose the Recurrent Attended Variation Embedding Network (RAVEN) that models the fine-grained structure of nonverbal subword sequences and dynamically shifts word representations based on nonverbal cues. Our proposed model achieves competitive performance on two publicly available datasets for multimodal sentiment analysis and emotion recognition. We also visualize the shifted word representations in different nonverbal contexts and summarize common patterns regarding multimodal variations of word representations.
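The core mechanism the abstract describes — attending over the fine-grained nonverbal frames within a word segment, then shifting the word's embedding by the resulting summary — can be sketched in a few lines. This is a minimal numpy illustration, not the authors' implementation: RAVEN encodes the visual/acoustic subword sequences with recurrent networks and learns gated shift vectors, whereas the dot-product attention and the `scale` hyperparameter below are simplifying assumptions for exposition.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D score vector."""
    exps = np.exp(scores - scores.max())
    return exps / exps.sum()

def shift_word_representation(word_emb, nonverbal_frames, scale=0.5):
    """Shift a word embedding toward an attention-pooled summary of the
    nonverbal (visual/acoustic) frames spanning that word's segment.

    word_emb:         (d,)  word embedding
    nonverbal_frames: (T, d) per-frame nonverbal features, assumed already
                      projected into the same d-dimensional space
    scale:            shift magnitude (illustrative stand-in for RAVEN's
                      learned gating)
    """
    # Score each frame by its alignment with the word being spoken.
    scores = nonverbal_frames @ word_emb        # (T,)
    weights = softmax(scores)                   # (T,)
    # Attention-pooled nonverbal summary for this word segment.
    summary = weights @ nonverbal_frames        # (d,)
    # Dynamic shift: the multimodal-adjusted word representation.
    return word_emb + scale * summary
```

With `scale=0` the function returns the original embedding unchanged, which makes the role of the shift term easy to see: the same word receives different representations depending on which nonverbal frames accompany it.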


Similar Articles

1. Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors.
   Proc AAAI Conf Artif Intell. 2019 Jul;33(1):7216-7223.
2. Integrating Multimodal Information in Large Pretrained Transformers.
   Proc Conf Assoc Comput Linguist Meet. 2020 Jul;2020:2359-2369. doi: 10.18653/v1/2020.acl-main.214.
3. Unsupervised Word Embedding Learning by Incorporating Local and Global Contexts.
   Front Big Data. 2020 Mar 11;3:9. doi: 10.3389/fdata.2020.00009. eCollection 2020.
4. Multi-attention Recurrent Network for Human Communication Comprehension.
   Proc AAAI Conf Artif Intell. 2018 Feb;2018:5642-5649.
5. Effects of cue modality and emotional category on recognition of nonverbal emotional signals in schizophrenia.
   BMC Psychiatry. 2016 Jul 7;16:218. doi: 10.1186/s12888-016-0913-7.
6. How the Brain Dynamically Constructs Sentence-Level Meanings From Word-Level Features.
   Front Artif Intell. 2022 Apr 21;5:733163. doi: 10.3389/frai.2022.733163. eCollection 2022.
7. Word meaning types acquired before vs. after age 5: implications for education.
   Front Psychol. 2024 Apr 5;15:1280568. doi: 10.3389/fpsyg.2024.1280568. eCollection 2024.
8. Probing Lexical Ambiguity: Word Vectors Encode Number and Relatedness of Senses.
   Cogn Sci. 2021 May;45(5):e12943. doi: 10.1111/cogs.12943.
9. ReCANVo: A database of real-world communicative and affective nonverbal vocalizations.
   Sci Data. 2023 Aug 5;10(1):523. doi: 10.1038/s41597-023-02405-7.
10. Subword Representations Successfully Decode Brain Responses to Morphologically Complex Written Words.
    Neurobiol Lang (Camb). 2024 Sep 11;5(4):844-863. doi: 10.1162/nol_a_00149. eCollection 2024.

Cited By

1. Analysis of the fusion of multimodal sentiment perception and physiological signals in Chinese-English cross-cultural communication: Transformer approach incorporating self-attention enhancement.
   PeerJ Comput Sci. 2025 May 23;11:e2890. doi: 10.7717/peerj-cs.2890. eCollection 2025.
2. Research on Multimodal Fusion of Temporal Electronic Medical Records.
   Bioengineering (Basel). 2024 Jan 18;11(1):94. doi: 10.3390/bioengineering11010094.
3. A Survey of Deep Learning-Based Multimodal Emotion Recognition: Speech, Text, and Face.
   Entropy (Basel). 2023 Oct 12;25(10):1440. doi: 10.3390/e25101440.
4. GCF-Net: global-aware cross-modal feature fusion network for speech emotion recognition.
   Front Neurosci. 2023 May 4;17:1183132. doi: 10.3389/fnins.2023.1183132. eCollection 2023.
5. Multimodal Emotion Recognition Based on Cascaded Multichannel and Hierarchical Fusion.
   Comput Intell Neurosci. 2023 Jan 5;2023:9645611. doi: 10.1155/2023/9645611. eCollection 2023.
6. Deep learning on multi-view sequential data: a survey.
   Artif Intell Rev. 2023;56(7):6661-6704. doi: 10.1007/s10462-022-10332-z. Epub 2022 Nov 29.
7. Application of big data and artificial intelligence in epidemic surveillance and containment.
   Intell Med. 2023 Feb;3(1):36-43. doi: 10.1016/j.imed.2022.10.003. Epub 2022 Nov 5.
8. AFR-BERT: Attention-based mechanism feature relevance fusion multimodal sentiment analysis model.
   PLoS One. 2022 Sep 9;17(9):e0273936. doi: 10.1371/journal.pone.0273936. eCollection 2022.
9. Sentiment Analysis and Emotion Recognition from Speech Using Universal Speech Representations.
   Sensors (Basel). 2022 Aug 24;22(17):6369. doi: 10.3390/s22176369.
10. Multimodal Sentiment Analysis Based on Cross-Modal Attention and Gated Cyclic Hierarchical Fusion Networks.
    Comput Intell Neurosci. 2022 Aug 9;2022:4767437. doi: 10.1155/2022/4767437. eCollection 2022.

References

1. Multi-attention Recurrent Network for Human Communication Comprehension.
   Proc AAAI Conf Artif Intell. 2018 Feb;2018:5642-5649.
2. Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment.
   Proc Conf Assoc Comput Linguist Meet. 2018 Jul;2018:2225-2235.
3. Long short-term memory.
   Neural Comput. 1997 Nov 15;9(8):1735-80. doi: 10.1162/neco.1997.9.8.1735.