Multi-Modality Emotion Recognition Model with GAT-Based Multi-Head Inter-Modality Attention

Author Information

Fu Changzeng, Liu Chaoran, Ishi Carlos Toshinori, Ishiguro Hiroshi

Affiliations

Advanced Telecommunications Research Institute International, Kyoto 619-0288, Japan.

Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan.

Publication Information

Sensors (Basel). 2020 Aug 29;20(17):4894. doi: 10.3390/s20174894.

Abstract

Emotion recognition has been gaining attention in recent years due to its applications to artificial agents. To achieve good performance on this task, much research has been conducted on multi-modality emotion recognition models that leverage the different strengths of each modality. However, a research question remains: what is the most appropriate way to fuse the information from different modalities? In this paper, we proposed audio sample augmentation and an emotion-oriented encoder-decoder to improve the performance of emotion recognition, and discussed an inter-modality, decision-level fusion method based on a graph attention network (GAT). Compared to the baseline, our model improved the weighted average F1-score from 64.18% to 68.31% and the weighted average accuracy from 65.25% to 69.88%.
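The fusion idea named in the abstract, multi-head inter-modality attention computed in GAT style over a small graph of modality nodes, can be illustrated with a short sketch. The PyTorch code below is not the authors' implementation: it treats each modality embedding (e.g., audio and text) as a node of a fully connected graph, computes per-head GAT attention coefficients, and classifies the fused representation. The two-modality setup, all dimensions, and the four-class output are illustrative assumptions, and the node features stand in for whatever per-utterance representations the paper's encoder-decoder would produce.

```python
# Minimal sketch of GAT-style multi-head inter-modality attention fusion.
# Not the paper's code; shapes and hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadGATFusion(nn.Module):
    def __init__(self, in_dim, out_dim, num_heads=4, num_classes=4):
        super().__init__()
        self.num_heads = num_heads
        self.out_dim = out_dim
        # Shared projection W applied to every modality node, one slice per head.
        self.proj = nn.Linear(in_dim, out_dim * num_heads, bias=False)
        # Attention vector a for e_ij = LeakyReLU(a^T [W h_i || W h_j]), per head.
        self.attn = nn.Parameter(torch.empty(num_heads, 2 * out_dim))
        nn.init.xavier_uniform_(self.attn)
        self.classifier = nn.Linear(out_dim * num_heads, num_classes)

    def forward(self, nodes):
        # nodes: (batch, n_modalities, in_dim) -- one node per modality.
        b, n, _ = nodes.shape
        h = self.proj(nodes).view(b, n, self.num_heads, self.out_dim)
        h = h.permute(0, 2, 1, 3)                                   # (b, H, n, d)
        # Decomposed GAT logits on a fully connected modality graph.
        a_src = self.attn[:, :self.out_dim].view(1, self.num_heads, 1, -1)
        a_dst = self.attn[:, self.out_dim:].view(1, self.num_heads, 1, -1)
        src = (h * a_src).sum(-1)                                   # (b, H, n)
        dst = (h * a_dst).sum(-1)                                   # (b, H, n)
        e = F.leaky_relu(src.unsqueeze(-1) + dst.unsqueeze(-2), 0.2)
        alpha = torch.softmax(e, dim=-1)                            # attention coefficients
        fused = torch.matmul(alpha, h)                              # (b, H, n, d)
        # Pool over modality nodes, concatenate heads, classify.
        fused = fused.mean(dim=2).reshape(b, -1)                    # (b, H*d)
        return self.classifier(fused)


# Usage: fuse assumed 128-d audio and text embeddings for a batch of 8 utterances.
model = MultiHeadGATFusion(in_dim=128, out_dim=32, num_heads=4, num_classes=4)
audio_emb = torch.randn(8, 128)
text_emb = torch.randn(8, 128)
logits = model(torch.stack([audio_emb, text_emb], dim=1))           # (8, 4)
```

Because the graph has only a handful of nodes, the attention matrix is tiny; the per-head softmax over neighboring modalities is what lets the model weight, say, audio over text on an utterance-by-utterance basis.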


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45cf/7506856/4fc4f6e72c3c/sensors-20-04894-g001.jpg
