Fu Lifang, Peng Huanxin, Liu Shuai
Northeast Agricultural University, Harbin, 150030 China.
School of Engineering, Northeast Agricultural University, Harbin, 150030 China.
J Supercomput. 2023 May 15:1-28. doi: 10.1007/s11227-023-05381-2.
The widespread dissemination of fake news on social media adversely affects the public and social development. Most existing detection techniques are limited to a single domain (e.g., medicine or politics). However, domains commonly differ in many respects, such as word usage, so these methods perform poorly when applied to other domains. In the real world, millions of news pieces in diverse domains are released on social media every day. It is therefore of significant practical importance to develop a fake news detection model that can be applied to multiple domains. In this paper, we propose a novel knowledge graph (KG)-based framework for multi-domain fake news detection, named KG-MFEND. The model improves performance by modifying BERT and integrating external knowledge to alleviate domain differences at the word level. Specifically, we construct a new KG that encompasses multi-domain knowledge and inject entity triples to build a sentence tree that enriches the background knowledge of the news. To address the embedding space and knowledge noise problems, we use soft positions and a visible matrix in the knowledge embedding. To reduce the influence of label noise, we add label smoothing during training. Extensive experiments are conducted on real Chinese datasets. The results show that KG-MFEND generalizes well in single-domain, mixed-domain, and multi-domain settings and outperforms current state-of-the-art methods for multi-domain fake news detection.
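The abstract names three mechanisms without spelling them out: a sentence tree built from injected triples, soft positions with a visible matrix, and label smoothing. The following is a minimal sketch of how these pieces are commonly realized, assuming a K-BERT-style construction (branch tokens continue the position count of their entity, and branch tokens are visible only to their own branch and its entity). The function names, the toy knowledge graph, and the smoothing value `eps` are hypothetical illustrations, not details taken from KG-MFEND.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str]  # (relation, object) attached to an entity token


def build_sentence_tree(
    tokens: List[str], kg: Dict[str, List[Triple]]
) -> Tuple[List[str], List[int], List[List[bool]]]:
    """Flatten a sentence tree and return (tokens, soft positions, visible matrix).

    Assumed K-BERT-style scheme: trunk tokens keep their original positions,
    and each injected branch continues counting from its entity's position.
    """
    flat: List[str] = []
    soft_pos: List[int] = []
    branch_id: List[int] = []  # -1 for trunk tokens, else the branch index
    anchor: List[int] = []     # flat index of the entity a branch hangs from
    next_branch = 0

    for pos, tok in enumerate(tokens):
        flat.append(tok)
        soft_pos.append(pos)
        branch_id.append(-1)
        anchor.append(-1)
        entity_idx = len(flat) - 1
        for rel, obj in kg.get(tok, []):
            for offset, branch_tok in enumerate((rel, obj), start=1):
                flat.append(branch_tok)
                soft_pos.append(pos + offset)
                branch_id.append(next_branch)
                anchor.append(entity_idx)
            next_branch += 1

    n = len(flat)
    visible = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            both_trunk = branch_id[i] == -1 and branch_id[j] == -1
            same_branch = branch_id[i] != -1 and branch_id[i] == branch_id[j]
            entity_link = anchor[i] == j or anchor[j] == i
            visible[i][j] = both_trunk or same_branch or entity_link
    return flat, soft_pos, visible


def smooth_labels(true_class: int, num_classes: int, eps: float = 0.1) -> List[float]:
    """Standard label smoothing: (1 - eps) * one_hot + eps / num_classes."""
    uniform = eps / num_classes
    return [
        (1.0 - eps) + uniform if k == true_class else uniform
        for k in range(num_classes)
    ]


if __name__ == "__main__":
    # Hypothetical toy knowledge graph with a single triple.
    toy_kg = {"Harbin": [("capital_of", "Heilongjiang")]}
    flat, soft, vis = build_sentence_tree(["Harbin", "is", "cold"], toy_kg)
    print(flat)
    print(soft)
    print(smooth_labels(true_class=1, num_classes=2))
```

In this sketch the injected tokens `capital_of` and `Heilongjiang` share soft positions with `is` and `cold`, so the original word order is preserved for the trunk, while the visible matrix stops the injected knowledge from directly influencing unrelated trunk tokens. In a BERT-based model the boolean matrix would typically be converted into an additive attention mask.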