Abdalla Hemn Barzan, Ahmed Awder M, Zeebaree Subhi R M, Alkhayyat Ahmed, Ihnaini Baha
Department of Computer Science, Wenzhou-Kean University, Wenzhou, Zhejiang, China.
Department of Communication Engineering, Technical College of Engineering, Sulaimani Polytechnic University, Sulaymaniyah, Iraq.
PeerJ Comput Sci. 2022 Mar 31;8:e937. doi: 10.7717/peerj-cs.937. eCollection 2022.
Rising demand for information and the rapid growth of big data have dramatically increased the volume of textual data. Text classification is therefore an essential step in extracting useful information from text. This article describes the development of a hybrid optimization algorithm for text classification. Pre-processing consists of stemming and stop word removal. Important features are then extracted, and optimal features are selected using the Tanimoto similarity, which estimates the similarity between features and selects the relevant ones with higher feature selection accuracy. Next, a deep residual network trained by the Adam algorithm is used for dynamic text classification. Dynamic learning is performed using the proposed Rider invasive weed optimization (RIWO)-based deep residual network together with fuzzy theory. The proposed RIWO algorithm combines invasive weed optimization (IWO) with the Rider optimization algorithm (ROA). All of these steps are carried out under the MapReduce framework. In our analysis, the proposed RIWO-based deep residual network outperformed the other techniques compared, achieving the highest true positive rate (TPR) of 85%, true negative rate (TNR) of 94%, and accuracy of 88.7%.
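The abstract names the Tanimoto similarity as the feature selection criterion but does not give the selection rule. The sketch below is only an illustration under stated assumptions: it uses the standard real-valued Tanimoto coefficient, a·b / (|a|² + |b|² − a·b), and it assumes that each feature column is scored against the class label vector and the top k features are kept. The names `tanimoto` and `select_features`, the label-based scoring, and the parameter `k` are hypothetical and are not taken from the article.

```python
def tanimoto(a, b):
    """Real-valued Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # The denominator is zero only when both vectors are all zeros.
    return dot / denom if denom else 0.0


def select_features(columns, labels, k):
    """Keep the k feature names whose columns are most similar to the labels.

    Assumption (not from the article): relevance is measured as the
    Tanimoto similarity between a feature column and the label vector.
    """
    scores = {name: tanimoto(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy term-presence columns for four documents (illustrative data only).
    columns = {
        "f1": [1, 0, 1, 0],
        "f2": [0, 1, 0, 1],
        "f3": [1, 0, 0, 0],
    }
    labels = [1, 0, 1, 0]
    print(select_features(columns, labels, k=2))
```

For binary vectors this coefficient reduces to the Jaccard index, so it fits term-presence features naturally. Weighted features such as TF-IDF also work with the same formula.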