Multimodal hate speech detection: a novel deep learning framework for multilingual text and images.

Author information

Saddozai Furqan Khan, Badri Sahar K, Alghazzawi Daniyal, Khattak Asad, Asghar Muhammad Zubair

Affiliations

Gomal Research Institute of Computing, Faculty of Computing, Gomal University, D.I.Khan, KP, Pakistan.

Information Systems Department, Faculty of Computing and Information Technology, King Abdul Aziz University, Jeddah, Saudi Arabia.

Publication information

PeerJ Comput Sci. 2025 Apr 16;11:e2801. doi: 10.7717/peerj-cs.2801. eCollection 2025.

DOI: 10.7717/peerj-cs.2801

PMID: 40567705

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12190340/

Abstract

The rapid proliferation of social media platforms has facilitated the expression of opinions but also enabled the spread of hate speech. Detecting multimodal hate speech in low-resource multilingual contexts poses significant challenges. This study presents a deep learning framework that integrates bidirectional long short-term memory (BiLSTM) and EfficientNetB1 to classify hate speech in Urdu-English tweets, leveraging both text and image modalities. We introduce multimodal multilingual hate speech (MMHS11K), a manually annotated dataset comprising 11,000 multimodal tweets. Using an early fusion strategy, text and image features were combined for classification. Experimental results demonstrate that the BiLSTM+EfficientNetB1 model outperforms unimodal and baseline multimodal approaches, achieving an F1-score of 81.2% for Urdu tweets and 75.5% for English tweets. This research addresses critical gaps in multilingual and multimodal hate speech detection, offering a foundation for future advancements.
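
To make the early fusion step concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the feature vectors are random placeholders standing in for BiLSTM (text) and EfficientNetB1 (image) outputs, the vector sizes (8 and 16) are arbitrary assumptions, and the single logistic unit with untrained weights is only a stand-in for whatever classification head the paper uses. The helper names `early_fusion` and `predict_hate_probability` are hypothetical.

```python
import math
import random


def early_fusion(text_features, image_features):
    """Early fusion: join both modality vectors into one before classification."""
    return list(text_features) + list(image_features)


def predict_hate_probability(fused, weights, bias):
    """Single logistic unit over the fused vector (stand-in for a classifier head)."""
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder feature vectors. In the paper these would come from a BiLSTM
# (text) and EfficientNetB1 (image); the sizes used here are assumptions.
rng = random.Random(0)
text_features = [rng.uniform(-1, 1) for _ in range(8)]
image_features = [rng.uniform(-1, 1) for _ in range(16)]

# Fuse first, then classify the joint representation.
fused = early_fusion(text_features, image_features)
weights = [rng.uniform(-0.1, 0.1) for _ in fused]  # untrained, illustrative only
probability = predict_hate_probability(fused, weights, bias=0.0)
print(len(fused), round(probability, 3))
```

The point of the sketch is the ordering: the two modalities are merged into one vector before any decision is made, so the classifier sees text and image evidence jointly. A late fusion design would instead score each modality separately and combine the scores afterwards.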
