Albtoush Eman Salamah, Gan Keng Hoon, Alrababa Saif A Ahmad
School of Computer Sciences, Universiti Sains Malaysia, Gelugor, Malaysia.
Faculty of Information Technology, Al al-Bayt University, Mafraq, Jordan.
PeerJ Comput Sci. 2025 Mar 11;11:e2693. doi: 10.7717/peerj-cs.2693. eCollection 2025.
The proliferation of fake news has become a significant threat, influencing individuals, institutions, and societies at large. This issue has been exacerbated by the pervasive integration of social media into daily life, which directly shapes opinions, trends, and even national economies. Social media platforms have struggled to mitigate the effects of fake news, relying primarily on traditional methods based on human expertise and knowledge. Consequently, machine learning (ML) and deep learning (DL) techniques now play a critical role in identifying fake news, and they need to be deployed widely to counter the rapid spread of misinformation across all languages, particularly Arabic. Detecting fake news in Arabic presents unique challenges, including complex grammar, diverse dialects, a scarcity of annotated datasets, and less prior research on fake news detection than exists for English. This study provides a comprehensive review of fake news, examining its types, domains, characteristics, life cycle, and detection approaches. It further explores recent research that leverages ML, DL, and transformer-based techniques for fake news detection, with special attention to Arabic. It covers Arabic-specific pre-processing techniques, methodologies tailored to fake news detection in the language, and the datasets employed in these studies. Finally, it outlines future research directions aimed at developing more effective and robust strategies for detecting fake news in Arabic content.