AFND: Arabic fake news dataset for the detection and classification of articles credibility.

Author information

Khalil Ashwaq, Jarrah Moath, Aldwairi Monther, Jaradat Manar

Affiliations

Department of Computer Engineering, Jordan University of Science and Technology, PO Box 3030, Irbid 22110, Jordan.

College of Technological Innovation, Zayed University, Abu Dhabi, UAE.

Publication information

Data Brief. 2022 Apr 8;42:108141. doi: 10.1016/j.dib.2022.108141. eCollection 2022 Jun.

DOI:10.1016/j.dib.2022.108141
PMID:35496492
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9048144/

Abstract

The news credibility detection task has started to gain more attention recently due to the rapid increase of news on different social media platforms. This article provides a large, labeled, and diverse Arabic Fake News Dataset (AFND) that is collected from public Arabic news websites. This dataset enables the research community to use supervised and unsupervised machine learning algorithms to classify the credibility of Arabic news articles. AFND consists of 606912 public news articles that were scraped from 134 public news websites of 19 different Arab countries over a 6-month period using Python scripts. The Arabic fact-check platform, Misbar, is used manually to classify each public news source into credible, not credible, or undecided. Weak supervision is applied to label news articles with the same label as the public source. AFND is imbalanced in the number of articles in each class. Hence, it is useful for researchers who focus on finding solutions for imbalanced datasets. The dataset is available in JSON format and can be accessed from Mendeley Data repository.
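The weak-supervision step described in the abstract, where each article inherits the credibility label of its source, can be sketched as follows. This is a minimal illustration, not the authors' code: the source names, the article fields (`source`, `title`, `text`, `label`), and the in-memory lists are hypothetical and do not reflect the actual AFND JSON schema. The inverse-frequency class weighting is likewise an assumption, shown only as one common way to approach the class imbalance the abstract mentions.

```python
from collections import Counter

# Hypothetical source-level labels. In AFND these come from manually
# checking each news source against the Misbar fact-check platform.
SOURCE_LABELS = {
    "source_a": "credible",
    "source_b": "not credible",
    "source_c": "undecided",
}

# Hypothetical scraped articles; field names are illustrative only.
ARTICLES = [
    {"source": "source_a", "title": "a1", "text": "..."},
    {"source": "source_a", "title": "a2", "text": "..."},
    {"source": "source_a", "title": "a3", "text": "..."},
    {"source": "source_b", "title": "b1", "text": "..."},
    {"source": "source_b", "title": "b2", "text": "..."},
    {"source": "source_c", "title": "c1", "text": "..."},
]


def propagate_labels(articles, source_labels):
    """Weak supervision: give every article the label of its source."""
    return [{**a, "label": source_labels[a["source"]]} for a in articles]


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


labeled = propagate_labels(ARTICLES, SOURCE_LABELS)
weights = balanced_class_weights([a["label"] for a in labeled])

print(Counter(a["label"] for a in labeled))
print(weights)
```

Because labels are assigned per source rather than per article, any individual article may be mislabeled; the weights above only compensate for class frequency and do nothing about that label noise.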

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fa0/9048144/f766dafbc4c8/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fa0/9048144/34afd220dbd9/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fa0/9048144/c1c3d178e7cf/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fa0/9048144/d69466722c0d/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fa0/9048144/5284b47d8fd0/gr5.jpg

Similar articles

1
AFND: Arabic fake news dataset for the detection and classification of articles credibility.
Data Brief. 2022 Apr 8;42:108141. doi: 10.1016/j.dib.2022.108141. eCollection 2022 Jun.
2
Arabic Fake News Detection Based on Textual Analysis.
Arab J Sci Eng. 2022;47(8):10453-10469. doi: 10.1007/s13369-021-06449-y. Epub 2022 Feb 11.
3
SANAD: Single-label Arabic News Articles Dataset for automatic text categorization.
Data Brief. 2019 Jun 4;25:104076. doi: 10.1016/j.dib.2019.104076. eCollection 2019 Aug.
4
Arabic fake news detection based on deep contextualized embedding models.
Neural Comput Appl. 2022;34(18):16019-16032. doi: 10.1007/s00521-022-07206-4. Epub 2022 May 3.
5
The role of analytical reasoning and source credibility on the evaluation of real and fake full-length news articles.
Cogn Res Princ Implic. 2021 Mar 31;6(1):24. doi: 10.1186/s41235-021-00292-3.
6
ANAD: Arabic news article dataset.
Data Brief. 2023 Jul 29;50:109460. doi: 10.1016/j.dib.2023.109460. eCollection 2023 Oct.
7
Dataset for multimodal fake news detection and verification tasks.
Data Brief. 2024 Apr 16;54:110440. doi: 10.1016/j.dib.2024.110440. eCollection 2024 Jun.
8
Enhancing the Predictive Performance of Credibility-Based Fake News Detection Using Ensemble Learning.
Rev Socionetwork Strateg. 2022;16(2):259-289. doi: 10.1007/s12626-022-00127-7. Epub 2022 Sep 17.
9
EchoFakeD: improving fake news detection in social media with an efficient deep neural network.
Neural Comput Appl. 2021;33(14):8597-8613. doi: 10.1007/s00521-020-05611-1. Epub 2021 Jan 2.
10
Stance detection with BERT embeddings for credibility analysis of information on social media.
PeerJ Comput Sci. 2021 Apr 14;7:e467. doi: 10.7717/peerj-cs.467. eCollection 2021.

Cited by

1
VERA-ARAB: unveiling the Arabic tweets credibility by constructing balanced news dataset for veracity analysis.
PeerJ Comput Sci. 2024 Oct 30;10:e2432. doi: 10.7717/peerj-cs.2432. eCollection 2024.
2
Ensemble based high performance deep learning models for fake news detection.
Sci Rep. 2024 Nov 4;14(1):26591. doi: 10.1038/s41598-024-76286-0.