Deep convolutional forest: a dynamic deep ensemble approach for spam detection in text.

Authors

Shaaban Mai A, Hassan Yasser F, Guirguis Shawkat K

Affiliations

Department of Mathematics and Computer Science, Faculty of Science, Alexandria University, Alexandria, Egypt.

Faculty of Computers and Data Science, Alexandria University, Alexandria, Egypt.

Publication information

Complex Intell Systems. 2022;8(6):4897-4909. doi: 10.1007/s40747-022-00741-6. Epub 2022 Apr 26.

DOI:10.1007/s40747-022-00741-6

PMID:35496326

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9039275/

Abstract

The growing use of mobile messaging services has led to the spread of social engineering attacks such as phishing, since spam text is one of the main vehicles for phishing attacks that steal sensitive data such as credit card details and passwords. In addition, rumors and incorrect medical information regarding the COVID-19 pandemic are widely shared on social media, leading to fear and confusion. Thus, filtering spam content is vital to reduce risks and threats. Previous studies relied on machine learning and deep learning approaches for spam classification, but these approaches have two limitations: machine learning models require manual feature engineering, whereas deep neural networks incur a high computational cost. This paper introduces a dynamic deep ensemble model for spam detection that adjusts its complexity and extracts features automatically. The proposed model uses convolutional and pooling layers for feature extraction, along with base classifiers such as random forests and extremely randomized trees that classify texts as spam or legitimate. Moreover, the model employs ensemble learning procedures such as boosting and bagging. As a result, the model achieved high precision, recall, and F1-score, and an accuracy of 98.38%.
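
The feature-extraction step mentioned in the abstract (convolution followed by pooling) can be illustrated with a minimal sketch. This is not the authors' architecture: the two-dimensional word vectors, the filter weights, and the helper names `conv1d`, `global_max_pool`, and `extract_features` are all hypothetical, chosen only to show how a variable-length message becomes a fixed-length feature vector that a tree-based classifier such as a random forest could consume.

```python
from typing import List

Vector = List[float]


def conv1d(sequence: List[Vector], kernel: List[Vector], bias: float = 0.0) -> List[float]:
    """Slide one filter over a sequence of word vectors (no padding, stride 1)."""
    width = len(kernel)
    feature_map = []
    for start in range(len(sequence) - width + 1):
        total = bias
        for offset in range(width):
            word_vec = sequence[start + offset]
            weights = kernel[offset]
            total += sum(x * w for x, w in zip(word_vec, weights))
        feature_map.append(max(0.0, total))  # ReLU activation
    return feature_map


def global_max_pool(feature_map: List[float]) -> float:
    """Keep the strongest response of a filter; 0.0 if the message is too short."""
    return max(feature_map) if feature_map else 0.0


def extract_features(sequence: List[Vector], kernels: List[List[Vector]]) -> List[float]:
    """One pooled value per filter, so the output length equals the number of filters."""
    return [global_max_pool(conv1d(sequence, kernel)) for kernel in kernels]


# Toy message: three tokens with made-up 2-dimensional embeddings (assumption).
message = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Two made-up filters, each spanning two consecutive tokens (assumption).
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]

features = extract_features(message, kernels)
print(features)
```

The output has one entry per filter regardless of message length, which is what allows a classifier that expects fixed-size input to be placed after the pooling layer. In a trained model the filter weights would be learned rather than set by hand.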

