High-Tech Institute of Xi'an, Xi'an, 710025, Shaanxi, China.
Sci Rep. 2022 Jul 21;12(1):12458. doi: 10.1038/s41598-022-16576-7.
Spammer detection is essentially a process of judging the authenticity of users and can therefore be treated as a classification problem. To improve classification performance, multi-classifier information fusion is commonly used to detect spammers automatically by combining the outputs of several classifiers. However, existing fusion strategies do not adequately account for the uncertainty in the results of different classifiers (views), and they do not strictly distinguish the relative importance and reliability of each classifier. To detect spammers effectively, this paper therefore develops a novel multi-classifier information fusion model based on the evidential reasoning (ER) rule. First, following the user characterization strategy, base classifiers are constructed through profile-based, content-based and behavior-based approaches. Then, the idea of multi-classifier fusion is combined with the ER rule, and the results of the base classifiers are aggregated by taking their weights and reliabilities into account. Extensive experiments on a real-world dataset verify the effectiveness of the proposed model.
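To make the aggregation step concrete, the following is a minimal sketch of how the class probabilities of several base classifiers might be fused with a recursive ER-rule style combination. It is an illustration under stated assumptions, not the model from the paper: the function name `er_fuse`, the restriction to singleton classes (for example "spammer" and "legitimate") and all numeric values are hypothetical, and the paper's exact formulation, including how weights and reliabilities are obtained, may differ.

```python
from typing import List, Sequence


def er_fuse(
    probs: Sequence[Sequence[float]],
    weights: Sequence[float],
    reliabilities: Sequence[float],
) -> List[float]:
    """Fuse per-classifier class probabilities (hypothetical helper).

    probs[i][k]      : probability that classifier i assigns to class k
    weights[i]       : relative importance of classifier i, in [0, 1]
    reliabilities[i] : reliability of classifier i, in [0, 1]

    Assumption: every classifier spreads its belief over singleton
    classes only, so set intersections reduce to "same class".
    """
    n_classes = len(probs[0])
    # First piece of evidence: weighted masses, plus the residual
    # support (1 - reliability) that is not committed to any class.
    mass = [weights[0] * p for p in probs[0]]
    residual = 1.0 - reliabilities[0]

    for p_i, w_i, r_i in zip(probs[1:], weights[1:], reliabilities[1:]):
        m_i = [w_i * p for p in p_i]
        # Bounded sum of individual support plus orthogonal sum of
        # collective support for each class.
        new_mass = [
            (1.0 - r_i) * mass[k] + residual * m_i[k] + mass[k] * m_i[k]
            for k in range(n_classes)
        ]
        new_residual = (1.0 - r_i) * residual
        total = sum(new_mass) + new_residual
        if total == 0.0:
            raise ValueError("evidence is in total conflict")
        mass = [m / total for m in new_mass]
        residual = new_residual / total

    support = sum(mass)
    if support == 0.0:
        raise ValueError("no class received any support")
    # Final belief degrees: renormalise over the classes only.
    return [m / support for m in mass]


if __name__ == "__main__":
    # Illustrative outputs of profile-, content- and behavior-based
    # classifiers as [P(spammer), P(legitimate)]; values are made up.
    outputs = [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]
    fused = er_fuse(outputs, weights=[0.9, 0.7, 0.5],
                    reliabilities=[0.9, 0.8, 0.6])
    print(fused)
```

In this sketch a classifier with weight 0 and reliability 0 leaves the fused result unchanged, while fully reliable, fully weighted classifiers reduce to a Dempster-style product of their probabilities, which is one way to see how weight and reliability moderate each classifier's influence.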