Enhanced Semantic Representation Learning for Sarcasm Detection by Integrating Context-Aware Attention and Fusion Network.

Authors

Hao Shufeng, Yao Jikun, Shi Chongyang, Zhou Yu, Xu Shuang, Li Dengao, Cheng Yinghan

Affiliations

College of Data Science, Taiyuan University of Technology, Taiyuan 030024, China.

Key Laboratory of Big Data Fusion Analysis and Application of Shanxi Province, Taiyuan 030024, China.

Publication information

Entropy (Basel). 2023 May 30;25(6):878. doi: 10.3390/e25060878.

Abstract

Sarcasm is a sophisticated form of figurative language that is prevalent on social media platforms. Automatic sarcasm detection is important for understanding users' real sentiment tendencies. Traditional approaches mostly focus on content features, using lexicon-based, n-gram, and pragmatic-feature-based models. However, these methods ignore the diverse contextual clues that could provide more evidence of a sentence's sarcastic nature. In this work, we propose a Contextual Sarcasm Detection Model (CSDM) that models enhanced semantic representations with user profiling and forum topic information, using context-aware attention and a user-forum fusion network to obtain diverse representations from distinct aspects. In particular, we employ a Bi-LSTM encoder with context-aware attention to obtain a refined comment representation by capturing sentence composition information and the corresponding context situations. Then, we employ a user-forum fusion network to obtain a comprehensive context representation by capturing the user's sarcastic tendencies and background knowledge about the comments. Our proposed method achieves accuracy values of 0.69, 0.70, and 0.83 on the Main balanced, Pol balanced, and Pol imbalanced datasets, respectively. Experimental results on SARC, a large Reddit corpus, demonstrate that our proposed method achieves a significant performance improvement over state-of-the-art textual sarcasm detection methods.
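
The data flow described in the abstract can be illustrated with a minimal sketch. The abstract does not give the model's equations, so everything below is an assumption made for illustration: the additive (tanh) attention form, the concatenate-then-dense fusion layer, the logistic output layer, all dimensions, and every function and variable name. Random vectors stand in for the Bi-LSTM hidden states and for the user and forum embeddings; this is not the authors' implementation.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_aware_attention(hidden_states, context, W_h, W_c, v):
    """Assumed additive attention: score_t = v . tanh(W_h h_t + W_c c)."""
    ctx_proj = matvec(W_c, context)
    scores = []
    for h in hidden_states:
        h_proj = matvec(W_h, h)
        act = [math.tanh(a + b) for a, b in zip(h_proj, ctx_proj)]
        scores.append(sum(vi * ai for vi, ai in zip(v, act)))
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum of the encoder states gives the refined comment vector.
    pooled = [sum(w * h[j] for w, h in zip(weights, hidden_states))
              for j in range(dim)]
    return pooled, weights


def fuse_user_forum(user_vec, forum_vec, W_f):
    """Hypothetical fusion layer: concatenate, project, apply tanh."""
    return [math.tanh(z) for z in matvec(W_f, user_vec + forum_vec)]


def sarcasm_probability(comment_vec, context_vec, w_out, b_out=0.0):
    """Assumed logistic output over the concatenated representations."""
    z = sum(w * x for w, x in zip(w_out, comment_vec + context_vec)) + b_out
    return 1.0 / (1.0 + math.exp(-z))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def random_vector(size, rng):
    return [rng.uniform(-0.5, 0.5) for _ in range(size)]


rng = random.Random(0)
# Illustrative sizes: tokens, encoder state, user, forum, fused context, attention.
T, d_h, d_u, d_f, d_c, d_a = 5, 8, 4, 4, 6, 6

hidden_states = random_matrix(T, d_h, rng)  # stand-in for Bi-LSTM outputs
user_vec = random_vector(d_u, rng)          # stand-in for a user embedding
forum_vec = random_vector(d_f, rng)         # stand-in for a forum embedding

# Step 1: attention over the comment, conditioned on the raw context.
W_h = random_matrix(d_a, d_h, rng)
W_c = random_matrix(d_a, d_u + d_f, rng)
v = random_vector(d_a, rng)
comment_vec, weights = context_aware_attention(
    hidden_states, user_vec + forum_vec, W_h, W_c, v)

# Step 2: fuse user and forum information into one context representation.
W_f = random_matrix(d_c, d_u + d_f, rng)
context_vec = fuse_user_forum(user_vec, forum_vec, W_f)

# Step 3: classify from both representations (untrained, random weights).
w_out = random_vector(d_h + d_c, rng)
p = sarcasm_probability(comment_vec, context_vec, w_out)
print("attention weights:", [round(w, 3) for w in weights])
print("sarcasm probability:", round(p, 3))
```

Because the weights are random and untrained, the printed probability carries no meaning. The sketch only shows how a context-conditioned comment vector and a fused user-forum vector could be combined into a single prediction.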

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5063/10297453/d4042078f4d5/entropy-25-00878-g001.jpg
