Multi-attention Recurrent Network for Human Communication Comprehension.

Authors

Zadeh Amir, Liang Paul Pu, Poria Soujanya, Vij Prateek, Cambria Erik, Morency Louis-Philippe

Affiliations

Carnegie Mellon University, USA.

NTU, Singapore.

Publication information

Proc AAAI Conf Artif Intell. 2018 Feb;2018:5642-5649.

Abstract

Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication; however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape the communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition and emotion recognition. MARN achieves state-of-the-art performance on all six datasets.
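
The abstract does not give the equations behind the Multi-attention Block, so the following is only an illustrative sketch of the general idea of applying several attention distributions to the hidden states of multiple modalities at one time step. It assumes the per-modality states are concatenated and that each attention is a softmax over a linear scoring of that concatenation. The names `multi_attention`, `score_weights` and the toy dimensions are hypothetical, and this is not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_attention(hidden_by_modality, score_weights):
    """Apply K attention distributions to concatenated modality states.

    hidden_by_modality: dict mapping a modality name to its hidden state
        (a list of floats) at one time step.
    score_weights: list of K square matrices (D x D), where D is the total
        length of the concatenated states. Assumed linear scoring layer.
    Returns a list of K attended vectors, each of length D.
    """
    # Concatenate in a fixed (alphabetical) order so results are reproducible.
    concat = [v for name in sorted(hidden_by_modality)
              for v in hidden_by_modality[name]]
    attended = []
    for weights in score_weights:
        scores = [sum(w * h for w, h in zip(row, concat)) for row in weights]
        attention = softmax(scores)
        # Each attention re-weights every dimension of the concatenation.
        attended.append([a * h for a, h in zip(attention, concat)])
    return attended


# Toy example: three modalities with made-up hidden states, K = 2 attentions.
rng = random.Random(0)
states = {
    "language": [0.5, -1.0],
    "vision": [0.3],
    "acoustic": [1.2],
}
dim = sum(len(v) for v in states.values())
weights = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
           for _ in range(2)]
attended = multi_attention(states, weights)
```

In a recurrent model, a summary of these attended vectors would be fed back into each modality's memory at the next time step; how MARN does this through its LSTHM is described in the full paper rather than in this abstract.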
