Text-image semantic relevance identification for aspect-based multimodal sentiment analysis.

Authors

Zhang Tianzhi, Zhou Gang, Lu Jicang, Li Zhibo, Wu Hao, Liu Shuo

Affiliation

Information Engineering University, Zhengzhou, Henan, China.

Publication information

PeerJ Comput Sci. 2024 Apr 12;10:e1904. doi: 10.7717/peerj-cs.1904. eCollection 2024.

DOI:10.7717/peerj-cs.1904

PMID:39669471

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11636758/

Abstract

Aspect-based multimodal sentiment analysis (ABMSA) is an emerging task in multimodal sentiment analysis that aims to identify the sentiment of each aspect mentioned in a multimodal sample. Although recent research on ABMSA has achieved some success, most existing models only use an attention mechanism to let the aspect interact with the text and the image separately, and obtain the sentiment output through multimodal concatenation. They often overlook the fact that the text and image of some samples may have no semantic relevance to each other. In this article, we propose a Text-Image Semantic Relevance Identification (TISRI) model for ABMSA to address this problem. Specifically, we introduce a multimodal feature relevance identification module to calculate the semantic similarity between text and image, and then construct an image gate to dynamically control the input image information. On this basis, image auxiliary information is provided to enhance the semantic expressiveness of the visual feature representation and generate a more intuitive image representation. Furthermore, we employ an attention mechanism during multimodal feature fusion to obtain a text-aware image representation through text-image interaction, preventing irrelevant image information from interfering with our model. Experiments demonstrate that TISRI achieves competitive results on two ABMSA Twitter datasets, validating the effectiveness of our methods.
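
The gating and fusion ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the relevance module or the gate is computed, so the cosine similarity over mean-pooled features, the sigmoid gate, the `sharpness` parameter, and all function names below are assumptions made purely for illustration. The sketch also assumes text and image features have already been projected into a shared space of equal dimension.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pool(rows):
    """Average a list of feature vectors into a single vector."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def image_gate(text_feats, image_feats, sharpness=5.0):
    """Hypothetical gate in (0, 1): a sigmoid of the scaled cosine
    similarity between pooled text and pooled image features."""
    sim = cosine_similarity(mean_pool(text_feats), mean_pool(image_feats))
    return 1.0 / (1.0 + math.exp(-sharpness * sim))


def gate_image(text_feats, image_feats, sharpness=5.0):
    """Scale every image region feature by the gate value, so an
    image judged irrelevant to the text contributes little."""
    g = image_gate(text_feats, image_feats, sharpness)
    return [[g * x for x in row] for row in image_feats]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_aware_image(text_feats, image_feats):
    """Scaled dot-product attention: each text token attends over the
    image regions and receives a weighted sum of region features."""
    d = len(image_feats[0])
    out = []
    for q in text_feats:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in image_feats]
        weights = softmax(scores)
        out.append([sum(w * k[j] for w, k in zip(weights, image_feats))
                    for j in range(d)])
    return out


if __name__ == "__main__":
    text = [[1.0, 0.0], [0.8, 0.2]]       # two token features (toy values)
    image = [[0.9, 0.1], [0.0, 1.0]]      # two region features (toy values)
    gated = gate_image(text, image)
    fused = text_aware_image(text, gated)
    print(image_gate(text, image), fused)
```

In this sketch the gate is a single scalar per sample, which is the simplest way to "dynamically control the input image information"; a per-region or per-dimension gate would be an equally plausible reading of the abstract.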

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/40d8/11636758/cfd3ff5dd2fc/peerj-cs-10-1904-g001.jpg

Similar articles

Text-image semantic relevance identification for aspect-based multimodal sentiment analysis.

PeerJ Comput Sci. 2024 Apr 12;10:e1904. doi: 10.7717/peerj-cs.1904. eCollection 2024.

Text-in-Image Enhanced Self-Supervised Alignment Model for Aspect-Based Multimodal Sentiment Analysis on Social Media.

Sensors (Basel). 2025 Apr 17;25(8):2553. doi: 10.3390/s25082553.

Self-adaptive attention fusion for multimodal aspect-based sentiment analysis.

Math Biosci Eng. 2024 Jan;21(1):1305-1320. doi: 10.3934/mbe.2024056. Epub 2022 Dec 27.

Hierarchical Fusion Network with Enhanced Knowledge and Contrastive Learning for Multimodal Aspect-Based Sentiment Analysis on Social Media.

Sensors (Basel). 2023 Aug 22;23(17):7330. doi: 10.3390/s23177330.

AFR-BERT: Attention-based mechanism feature relevance fusion multimodal sentiment analysis model.

PLoS One. 2022 Sep 9;17(9):e0273936. doi: 10.1371/journal.pone.0273936. eCollection 2022.

MIECF: Multi-faceted information extraction and cross-mixture fusion for multimodal aspect-based sentiment analysis.

Heliyon. 2024 Jun 14;10(12):e32967. doi: 10.1016/j.heliyon.2024.e32967. eCollection 2024 Jun 30.

VisdaNet: Visual Distillation and Attention Network for Multimodal Sentiment Classification.

Sensors (Basel). 2023 Jan 6;23(2):661. doi: 10.3390/s23020661.

Semantic enhancement and cross-modal interaction fusion for sentiment analysis in social media.

PLoS One. 2025 Apr 28;20(4):e0321011. doi: 10.1371/journal.pone.0321011. eCollection 2025.

Multimodal Sentiment Analysis Based on Cross-Modal Attention and Gated Cyclic Hierarchical Fusion Networks.

Comput Intell Neurosci. 2022 Aug 9;2022:4767437. doi: 10.1155/2022/4767437. eCollection 2022.

AB-GRU: An attention-based bidirectional GRU model for multimodal sentiment fusion and analysis.

Math Biosci Eng. 2023 Sep 27;20(10):18523-18544. doi: 10.3934/mbe.2023822.
