School of Computer and Software, Nanjing University of Information Science & Technology, Nanjing 210044, China.
Sensors (Basel). 2021 Dec 23;22(1):74. doi: 10.3390/s22010074.
The rising use of online media has changed the social habits of the public. Users have become accustomed to sharing daily experiences and publishing personal opinions on social networks. Social data carrying emotion and attitude provides significant decision support for numerous sentiment analysis tasks. Conventional methods for sentiment classification consider only the textual modality and are ill-suited to multimodal scenarios, while common multimodal approaches focus only on the interactions among modalities without considering the information unique to each modality. This paper proposes a hybrid fusion network that captures both inter-modal and intra-modal features. First, in the representation fusion stage, a multi-head visual attention mechanism is proposed to extract accurate semantic and sentiment information from textual content under the guidance of visual features. Then, in the decision fusion stage, multiple base classifiers are trained to learn independent and diverse discriminative information from the different modal representations. The final decision is obtained by fusing the decision supports of the base classifiers via a decision fusion method. To improve the generalization of the hybrid fusion network, a similarity loss is employed to inject decision diversity into the whole model. Empirical results on five multimodal datasets demonstrate that the proposed model achieves higher accuracy and better generalization capacity for multimodal sentiment analysis.
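The two fusion stages and the diversity term described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names are hypothetical, the attention uses the visual vector directly as the query with no learned projections, decision fusion is a plain average of class probabilities, and the similarity loss is taken to be the mean pairwise cosine similarity between base classifier outputs. The paper's actual formulation may differ on each point.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def visual_guided_attention(visual, tokens, num_heads):
    """Pool text token vectors using a visual vector as the query.

    Assumption: each head works on its own slice of the feature
    dimension, and no learned projection matrices are applied.
    """
    dim = len(visual)
    assert dim % num_heads == 0, "dim must be divisible by num_heads"
    head_dim = dim // num_heads
    pooled = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        query = visual[lo:hi]
        keys = [t[lo:hi] for t in tokens]
        # Scaled dot-product scores between the visual query and each token.
        scores = [dot(query, k) / math.sqrt(head_dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the token slices for this head.
        for d in range(head_dim):
            pooled.append(sum(w * k[d] for w, k in zip(weights, keys)))
    return pooled


def fuse_decisions(prob_lists):
    """Assumed decision fusion: average the base classifiers' probabilities."""
    n = len(prob_lists)
    num_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n for c in range(num_classes)]


def similarity_loss(prob_lists):
    """Assumed diversity term: mean pairwise cosine similarity.

    Minimizing this value pushes the base classifiers toward
    different decisions.
    """
    sims = []
    for i in range(len(prob_lists)):
        for j in range(i + 1, len(prob_lists)):
            a, b = prob_lists[i], prob_lists[j]
            norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
            sims.append(dot(a, b) / norm)
    return sum(sims) / len(sims)
```

In a trained model, the similarity term would typically be added to the classification loss with a weighting coefficient, so that the base classifiers stay accurate while being discouraged from producing identical outputs. That weighting is also an assumption here, not a detail stated in the abstract.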