Elhadad Mohamed K, Li Kin Fun, Gebali Fayez
Department of Electrical and Computer Engineering, University of Victoria, Victoria V8W 2Y2, Canada.
IEEE Access. 2020 Sep 9;8:165201-165215. doi: 10.1109/ACCESS.2020.3022867. eCollection 2020.
This article addresses the problem of detecting misleading information related to COVID-19. We propose a misleading-information detection model that relies on the World Health Organization, UNICEF, and the United Nations as sources of information, as well as epidemiological material collected from a range of fact-checking websites. Obtaining data from reliable sources should assure their validity. We use this collected ground-truth data to build a detection system that uses machine learning to identify misleading information. Ten machine learning algorithms, with seven feature extraction techniques, are used to construct a voting ensemble machine learning classifier. We perform 5-fold cross-validation to check the validity of the collected data and report the evaluation of twelve performance metrics. The evaluation results indicate the quality and validity of the collected ground-truth data and their effectiveness in constructing models to detect misleading information.
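As a rough illustration of the pipeline the abstract describes (several feature extractors feeding base classifiers whose predictions are combined by voting, evaluated with 5-fold cross-validation), the following is a minimal standard-library sketch. It is not the authors' implementation: the three feature extractors, the `TokenCountClassifier` class, the function names, and the ten toy sentences are all assumptions made up for this example, and the sketch uses three simple count-based voters rather than the ten algorithms and seven feature extraction techniques reported in the article.

```python
from collections import Counter

# Hypothetical feature extractors (the article uses seven; these three are stand-ins).
def unigrams(text):
    return text.lower().split()

def bigrams(text):
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])] or words

def char_trigrams(text):
    t = text.lower()
    return [t[i:i + 3] for i in range(len(t) - 2)]

EXTRACTORS = [unigrams, bigrams, char_trigrams]

class TokenCountClassifier:
    """Toy base learner: scores each label by normalized token overlap."""

    def __init__(self, extract):
        self.extract = extract
        self.counts = {}

    def fit(self, texts, labels):
        self.counts = {}
        for text, label in zip(texts, labels):
            self.counts.setdefault(label, Counter()).update(self.extract(text))
        return self

    def predict(self, text):
        tokens = self.extract(text)

        def score(label):
            counter = self.counts[label]
            total = sum(counter.values()) or 1
            return sum(counter[tok] for tok in tokens) / total

        return max(sorted(self.counts), key=score)

def hard_vote(labels):
    """Majority (hard) voting over the base learners' predictions."""
    return Counter(labels).most_common(1)[0][0]

def k_fold_splits(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous, non-overlapping folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop

def cross_validate(texts, labels, k=5):
    """Return the voting ensemble's accuracy on each of the k held-out folds."""
    accuracies = []
    for train_idx, test_idx in k_fold_splits(len(texts), k):
        train_texts = [texts[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        models = [TokenCountClassifier(f).fit(train_texts, train_labels)
                  for f in EXTRACTORS]
        correct = sum(
            hard_vote([m.predict(texts[i]) for m in models]) == labels[i]
            for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return accuracies

# Invented toy data for illustration only; not taken from the article's dataset.
TOY_TEXTS = [
    "wash your hands often with soap and water",
    "a secret miracle tonic cures the virus overnight",
    "cover your mouth when you cough or sneeze",
    "the miracle tonic cures everyone in one day",
    "stay home when you feel unwell and rest",
    "doctors hide the secret cure from the public",
    "clean your hands before you touch your face",
    "this secret trick cures the virus instantly",
    "keep distance from people who cough or sneeze",
    "one miracle trick doctors hide from everyone",
]
TOY_LABELS = ["reliable", "misleading"] * 5

if __name__ == "__main__":
    print(cross_validate(TOY_TEXTS, TOY_LABELS, k=5))
```

Running the script prints one accuracy value per fold. In practice, a library such as scikit-learn provides ready-made vectorizers, voting ensembles, and cross-validation utilities that would replace these hand-written helpers.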