Kotiyal Bina, Pathak Heman, Singh Nipur
Department of Computer Science, Gurukula Kangri (Deemed to be University), Haridwar, Uttarakhand, India.
Int J Inf Technol. 2023 Jun 4:1-13. doi: 10.1007/s41870-023-01288-6.
Fake news on social media has become a growing concern because of its potential to shape public opinion. The proposed Debunking Multi-Lingual Social Media Posts using Deep Learning (DSMPD) approach offers a promising way to detect fake news. The DSMPD approach involves creating a dataset of English and Hindi social media posts using web scraping and Natural Language Processing (NLP) techniques. This dataset is then used to train, test, and validate a deep learning-based model that extracts various features, including Embeddings from Language Models (ELMo), word and n-gram counts, Term Frequency-Inverse Document Frequency (TF-IDF), sentiments, polarity, and Named Entity Recognition (NER). Based on these features, the model classifies news items into five categories: real, could be real, could be fabricated, fabricated, or dangerously fabricated. To evaluate the performance of the classifiers, the researchers used two datasets comprising over 45,000 articles. Machine learning (ML) algorithms and a deep learning (DL) model are compared to select the best option for classification and prediction.
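As a rough illustration of two of the feature types named in the abstract (n-gram counts and TF-IDF), the following is a minimal Python sketch using only the standard library. It is not the authors' implementation: the whitespace tokenizer, the function names, the example posts, and the particular TF-IDF variant (relative term frequency times the natural log of inverse document frequency, without smoothing) are all assumptions made for illustration. The label list simply restates the five categories from the abstract.

```python
import math
from collections import Counter

# The five output categories named in the abstract.
LABELS = [
    "real",
    "could be real",
    "could be fabricated",
    "fabricated",
    "dangerously fabricated",
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return text.lower().split()


def ngram_counts(tokens, n):
    """Count contiguous n-grams, represented as tuples of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: weight = (count / doc_length) * ln(n_docs / doc_frequency).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


if __name__ == "__main__":
    # Hypothetical posts, invented for this example only.
    posts = ["breaking news today", "breaking story"]
    print(ngram_counts(tokenize(posts[0]), 2))
    print(tf_idf(posts))
```

In this variant, a term that occurs in every document receives a weight of zero, so only terms that distinguish one post from another contribute to the feature vector. A real pipeline would feed such vectors, together with the other features listed in the abstract, into a classifier that outputs one of the five labels.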