Zhang Tianzhi, Zhou Gang, Lu Jicang, Li Zhibo, Wu Hao, Liu Shuo
Information Engineering University, Zhengzhou, Henan, China.
PeerJ Comput Sci. 2024 Apr 12;10:e1904. doi: 10.7717/peerj-cs.1904. eCollection 2024.
Aspect-based multimodal sentiment analysis (ABMSA) is an emerging task in multimodal sentiment analysis that aims to identify the sentiment of each aspect mentioned in a multimodal sample. Although recent research on ABMSA has achieved some success, most existing models only use attention mechanisms to let the aspect interact with the text and the image separately, and obtain the sentiment output through multimodal concatenation; they often overlook that in some samples the text and image are not semantically related. In this article, we propose a Text-Image Semantic Relevance Identification (TISRI) model for ABMSA to address this problem. Specifically, we introduce a multimodal feature relevance identification module that calculates the semantic similarity between text and image, and then construct an image gate to dynamically control the input image information. On this basis, image auxiliary information is provided to enhance the semantic expressiveness of the visual feature representation and generate a more intuitive image representation. Furthermore, we employ an attention mechanism during multimodal feature fusion to obtain a text-aware image representation through text-image interaction, which prevents irrelevant image information from interfering with our model. Experiments demonstrate that TISRI achieves competitive results on two ABMSA Twitter datasets, validating the effectiveness of our methods.
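The abstract describes an image gate driven by text-image semantic similarity but does not give its formula. The following is a minimal sketch of that idea under stated assumptions, not the TISRI implementation: it assumes the text and image are already encoded as fixed-length vectors, uses cosine similarity as the relevance score, and maps that score to a gate in (0, 1) with a logistic function. The function names and the `sharpness` parameter are hypothetical and do not come from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def image_gate(text_vec, image_vec, sharpness=5.0):
    """Assumed gate: logistic function of the text-image cosine similarity.

    `sharpness` is a hypothetical scaling factor; larger values push the
    gate closer to 0 or 1.
    """
    sim = cosine_similarity(text_vec, image_vec)
    return 1.0 / (1.0 + math.exp(-sharpness * sim))


def gate_image_features(text_vec, image_vec, sharpness=5.0):
    """Scale every image feature by the gate, damping images unrelated to the text."""
    g = image_gate(text_vec, image_vec, sharpness)
    return [g * x for x in image_vec]


if __name__ == "__main__":
    text = [1.0, 0.0]
    related_image = [2.0, 0.0]     # same direction as the text vector
    unrelated_image = [-2.0, 0.0]  # opposite direction
    print(gate_image_features(text, related_image))
    print(gate_image_features(text, unrelated_image))
```

With these assumptions, an image vector aligned with the text passes through almost unchanged, one pointing the opposite way is suppressed almost entirely, and an orthogonal one receives a gate of exactly 0.5. A trained model would learn the relevance score rather than fix it by hand.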