

A comprehensive framework for multi-modal hate speech detection in social media using deep learning.

Authors

Prabhu R, Seethalakshmi V

Affiliations

Department of Information Technology, Dr. Mahalingam College of Engineering and Technology, Pollachi, India.

Department of Electronics and Communication Engineering, KPR Institute of Engineering and Technology, Coimbatore, India.

Published in

Sci Rep. 2025 Apr 15;15(1):13020. doi: 10.1038/s41598-025-94069-z.

DOI: 10.1038/s41598-025-94069-z
PMID: 40234479
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12000576/
Abstract

As social media platforms evolve, hate speech increasingly manifests across multiple modalities, including text, images, audio, and video, challenging traditional detection systems focused on single modalities. Hence, this research proposes a novel Multi-modal Hate Speech Detection Framework (MHSDF) that combines Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to analyze complex, heterogeneous data streams. This hybrid approach leverages CNNs for spatial feature extraction, such as identifying visual cues in images and local text patterns, and Long Short-Term Memory (LSTM) networks for modeling temporal dependencies and sequential information in text and audio. For textual content, the framework utilizes state-of-the-art word embeddings, including Word2Vec and BERT, to capture semantic relationships and contextual nuances. It integrates CNNs to extract n-gram patterns and RNNs to model long-range dependencies in sequences of up to 100 tokens. CNNs extract key spatial features in visual tasks, while LSTMs process video sequences to capture evolving visual patterns; image spatial features include object localization, color distributions, and text extracted via Optical Character Recognition (OCR). The fusion stage employs attention mechanisms to prioritize key interactions between modalities, enabling the detection of nuanced hate speech across formats, such as memes that blend offensive imagery with implicit text, sarcastic videos where toxicity is conveyed through tone and facial expressions, and multi-layered content that embeds discriminatory meaning. The numerical findings show that the proposed MHSDF model achieves a detection accuracy ratio of 98.53%, a robustness ratio of 97.64%, an interpretability ratio of 97.71%, a scalability ratio of 98.67%, and a performance ratio of 99.21%, surpassing other existing models.
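The abstract describes CNNs extracting local n-gram patterns from token embeddings, followed by pooling. The sketch below illustrates that idea in plain NumPy with a single hypothetical trigram filter; it is not the authors' implementation, and the filter weights here are random placeholders rather than learned parameters.

```python
import numpy as np

def ngram_conv_feature(embeddings, kernel, bias=0.0):
    """Slide a width-n kernel over token embeddings (seq_len x dim),
    producing one activation per n-gram window, apply ReLU, then
    max-over-time pool to a single feature, as a text CNN would."""
    n, dim = kernel.shape
    seq_len = embeddings.shape[0]
    acts = []
    for i in range(seq_len - n + 1):
        window = embeddings[i:i + n]            # one n-gram of embeddings
        acts.append(np.sum(window * kernel) + bias)
    acts = np.maximum(np.array(acts), 0.0)      # ReLU non-linearity
    return float(acts.max())                    # max-over-time pooling

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 8))                 # 100 tokens, dim-8 embeddings
k = rng.normal(size=(3, 8))                     # one illustrative trigram filter
feat = ngram_conv_feature(emb, k)
```

In the actual framework such filters would be learned jointly with the LSTM and fusion layers; here, max-over-time pooling simply reduces the 98 window activations to one scalar feature.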
Furthermore, the model's interpretability is enhanced through attention-based explanations, which provide insights into how multi-modal hate speech is identified. The framework improves traceability of decisions, interpretability by modality, and overall transparency.
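The attention-weighted fusion the abstract mentions can be reduced to a small sketch: score each modality's feature vector against a query, softmax the scores into weights, and take the weighted combination. The resulting weights also hint at the attention-based explanations described above, since they show which modality drove the decision. All names (`attention_fuse`, the query vector) are illustrative assumptions, not the paper's API.

```python
import numpy as np

def attention_fuse(modality_feats, query):
    """Scaled dot-product attention over modality feature vectors:
    returns the fused vector and the per-modality attention weights."""
    feats = np.stack(modality_feats)              # (num_modalities, dim)
    scores = feats @ query / np.sqrt(len(query))  # scaled dot-product scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over modalities
    fused = weights @ feats                       # attention-weighted sum
    return fused, weights

# Toy features for three modalities; the query favors the first axis.
text = np.array([1.0, 0.0, 0.0])
image = np.array([0.0, 1.0, 0.0])
audio = np.array([0.0, 0.0, 1.0])
fused, w = attention_fuse([text, image, audio], query=np.array([2.0, 0.5, 0.5]))
# weights sum to 1; the text modality, most aligned with the query, dominates
```

Inspecting `w` per example is one simple way such a fusion layer can expose which modality contributed most, supporting the traceability claims in the abstract.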

[Figures 1–11: available in the full text at https://pmc.ncbi.nlm.nih.gov/articles/PMC12000576/]

Similar Articles

1. A comprehensive framework for multi-modal hate speech detection in social media using deep learning. Sci Rep. 2025 Apr 15;15(1):13020. doi: 10.1038/s41598-025-94069-z.
2. Brain tumor segmentation and detection in MRI using convolutional neural networks and VGG16. Cancer Biomark. 2025 Mar;42(3):18758592241311184. doi: 10.1177/18758592241311184. Epub 2025 Apr 4.
3. A hybrid deep learning framework for early detection of diabetic retinopathy using retinal fundus images. Sci Rep. 2025 Apr 30;15(1):15166. doi: 10.1038/s41598-025-99309-w.
4. A novel hybrid deep learning IChOA-CNN-LSTM model for modality-enriched and multilingual emotion recognition in social media. Sci Rep. 2024 Sep 27;14(1):22270. doi: 10.1038/s41598-024-73452-2.
5. Detection of Hate Speech in COVID-19-Related Tweets in the Arab Region: Deep Learning and Topic Modeling Approach. J Med Internet Res. 2020 Dec 8;22(12):e22609. doi: 10.2196/22609.
6. Roman Urdu Hate Speech Detection Using Transformer-Based Model for Cyber Security Applications. Sensors (Basel). 2023 Apr 12;23(8):3909. doi: 10.3390/s23083909.
7. Detection of hate: speech tweets based convolutional neural network and machine learning algorithms. Sci Rep. 2024 Nov 21;14(1):28870. doi: 10.1038/s41598-024-76632-2.
8. A hybrid transformer and attention based recurrent neural network for robust and interpretable sentiment analysis of tweets. Sci Rep. 2024 Oct 22;14(1):24882. doi: 10.1038/s41598-024-76079-5.
9. Roman Urdu hate speech detection using hybrid machine learning models and hyperparameter optimization. Sci Rep. 2024 Nov 19;14(1):28590. doi: 10.1038/s41598-024-79106-7.
10. Hybrid deep learning for computational precision in cardiac MRI segmentation: Integrating Autoencoders, CNNs, and RNNs for enhanced structural analysis. Comput Biol Med. 2025 Mar;186:109597. doi: 10.1016/j.compbiomed.2024.109597. Epub 2025 Jan 1.
