College of Computer Science and Technology, Henan Institute of Technology, Xinxiang, Henan 453002, China.
College of Computer and Information Engineering, Henan Normal University, Xinxiang, Henan 453002, China.
Comput Intell Neurosci. 2022 Mar 28;2022:1354233. doi: 10.1155/2022/1354233. eCollection 2022.
To address the limited feature extraction ability of existing deep-learning-based rumor detection methods, this study proposes a deep-learning rumor detection method for the social network big data environment. First, an API is combined with a third-party crawler program to collect rumor posts from Weibo's public "false Weibo information" page, yielding a Weibo dataset that contains both rumor and nonrumor information. Second, distributed word vectors are used to encode the words of each text, with hierarchical Softmax and negative sampling used to improve training efficiency. Finally, a classification and detection model is built on a combination of semantic and statistical features: the memory function of Multi-BiLSTM is used to explore dependencies within the data, and the statistical features are combined with the semantic features to expand the feature space for rumor detection and describe the distribution of the data in that space more fully. Experiments show that, with a word vector dimension of 300, the proposed method improves accuracy by 4.232% and 1.478% and the F1 value by 5.011% and 1.795%, respectively, relative to the two comparison methods from the literature. The proposed method extracts data features more effectively and detects rumors better.
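The fusion of semantic and statistical features described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the statistical features, so the surface statistics below (post length, question marks, exclamation marks, links, mentions) are hypothetical choices, the semantic vector stands in for the output of a Multi-BiLSTM encoder, and the logistic scorer is an assumed placeholder for the final classification layer.

```python
import math
import re
from typing import List


def statistical_features(text: str) -> List[float]:
    """Surface statistics of a post. The feature set is a hypothetical choice."""
    return [
        float(len(text)),                               # post length in characters
        float(text.count("?") + text.count("？")),      # ASCII and full-width question marks
        float(text.count("!") + text.count("！")),      # ASCII and full-width exclamation marks
        float(len(re.findall(r"https?://\S+", text))),  # number of links
        float(text.count("@")),                         # number of mentions
    ]


def fuse(semantic: List[float], stats: List[float]) -> List[float]:
    """Concatenate the two feature groups into one expanded feature vector."""
    return list(semantic) + list(stats)


def rumor_probability(features: List[float], weights: List[float], bias: float) -> float:
    """Logistic score over the fused vector (assumed stand-in for the classifier)."""
    z = sum(w * x for w, x in zip(features, weights)) + bias
    # Numerically stable sigmoid for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Placeholder semantic vector; in the paper this role is played by Multi-BiLSTM output.
semantic_vec = [0.12, -0.40, 0.88]
post = "Is this true?! http://t.cn/abc @user"
fused = fuse(semantic_vec, statistical_features(post))
```

Here `fused` has eight entries: three semantic and five statistical. A trained model would learn the weights jointly over both groups, which is the sense in which the statistical features expand the feature space.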