SemSeq4FD: Integrating global semantic relationship and local sequential order to enhance text representation for fake news detection.

Authors

Wang Yuhang, Wang Li, Yang Yanjie, Lian Tao

Affiliation

Data Science College, Taiyuan University of Technology, Jinzhong, Shanxi, 030600, China.

Publication information

Expert Syst Appl. 2021 Mar 15;166:114090. doi: 10.1016/j.eswa.2020.114090. Epub 2020 Oct 3.

Abstract

The widespread circulation of fake news has caused huge losses to both governments and the public. Many existing works on fake news detection utilize propagation information such as propagator profiles and the propagation structure. However, such methods face the difficulty of data collection and cannot detect fake news at an early stage. An alternative approach is to detect fake news based solely on its content. Early content-based methods rely on manually designed linguistic features. Such shallow features are domain-dependent and cannot easily be generalized to cross-domain data. Recently, many natural language processing tasks have resorted to deep learning methods to learn word, sentence, and document representations. In this paper, we propose a novel graph-based neural network model named SemSeq4FD for early fake news detection based on enhanced text representations. In SemSeq4FD, we model the global pair-wise semantic relations between sentences as a complete graph, and learn the global sentence representations via a graph convolutional network with a self-attention mechanism. Considering the importance of local context in conveying the sentence meaning, we employ a 1D convolutional network to learn the local sentence representations. The two representations are combined to form the enhanced sentence representations. Then an LSTM-based network is used to model the sequence of enhanced sentence representations, yielding the final document representation for fake news detection. Experiments conducted on four real-world datasets in English and Chinese, including cross-source and cross-domain datasets, demonstrate that our model can outperform the state-of-the-art methods.
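The pipeline described in the abstract can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the exact layer equations, so scaled dot-product attention stands in for the graph convolution with self-attention, the two representations are combined by concatenation, the LSTM's last hidden state is taken as the document representation, and all weights are random. Every function and parameter name below (`global_repr`, `local_repr`, `init_params`, and so on) is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def global_repr(S, Wq, Wk, Wv):
    """Global view: every sentence attends to every other sentence,
    i.e. attention-weighted aggregation over a complete sentence graph.
    (Assumed form; the abstract does not specify the layer equations.)"""
    scores = (S @ Wq) @ (S @ Wk).T / np.sqrt(Wq.shape[1])  # (n, n) edge scores
    A = softmax(scores, axis=1)                            # rows sum to 1
    return np.tanh(A @ (S @ Wv))                           # (n, d_out)

def local_repr(S, Wc, width=3):
    """Local view: 1D convolution over a window of neighbouring sentences,
    zero-padded so the output keeps one vector per sentence."""
    n, d = S.shape
    pad = width // 2
    P = np.vstack([np.zeros((pad, d)), S, np.zeros((pad, d))])
    windows = np.stack([P[i:i + width].reshape(-1) for i in range(n)])
    return np.maximum(windows @ Wc, 0.0)                   # ReLU, (n, d_out)

def lstm_last_hidden(X, W, U, b):
    """Plain LSTM over the sentence sequence; returns the last hidden state."""
    h = np.zeros(U.shape[0])
    c = np.zeros(U.shape[0])
    for x in X:
        i, f, o, g = np.split(x @ W + h @ U + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def document_repr(S, p):
    """Combine global and local sentence vectors (concatenation is an
    assumption), then summarise the sequence with the LSTM."""
    enhanced = np.concatenate(
        [global_repr(S, p["Wq"], p["Wk"], p["Wv"]), local_repr(S, p["Wc"])],
        axis=1,
    )
    return lstm_last_hidden(enhanced, p["W"], p["U"], p["b"])

def init_params(d, d_out, h_dim, width=3, seed=0):
    """Random weights for illustration only; a real model would train them."""
    rng = np.random.default_rng(seed)
    r = lambda *shape: rng.normal(scale=0.1, size=shape)
    return {
        "Wq": r(d, d_out), "Wk": r(d, d_out), "Wv": r(d, d_out),
        "Wc": r(width * d, d_out),
        "W": r(2 * d_out, 4 * h_dim), "U": r(h_dim, 4 * h_dim),
        "b": np.zeros(4 * h_dim),
    }

# Usage: 5 sentences, each an 8-dimensional embedding (random stand-ins).
S = np.random.default_rng(1).normal(size=(5, 8))
params = init_params(d=8, d_out=6, h_dim=4)
doc_vec = document_repr(S, params)
print(doc_vec.shape)
```

In a trained model, `doc_vec` would feed a classifier that labels the article as fake or real; that final layer is omitted here because the abstract does not describe it.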

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef88/7532792/bb41c1afeb41/gr1_lrg.jpg
