Houssein Essam H, Hosney Mosa E, Mohamed Waleed M, Ali Abdelmgeid A, Younis Eman M G
Faculty of Computers and Information, Minia University, Minia, Egypt.
Faculty of Computers and Information, Luxor University, Luxor, Egypt.
Neural Comput Appl. 2023;35(7):5251-5275. doi: 10.1007/s00521-022-07916-9. Epub 2022 Nov 1.
Feature selection (FS) is one of the basic data preprocessing steps in data mining and machine learning. It reduces the number of features, which improves model generalization, enhances classification accuracy, and lowers model complexity; these gains are essential in several applications. Traditional feature selection methods often fail to find the global optimum because the search space is large. Many hybrid techniques have been proposed that merge several search strategies, each of which had previously been used on its own to solve the FS problem. This study proposes a modified hunger games search algorithm (mHGS) for solving optimization and FS problems. The proposed mHGS addresses three drawbacks of the original hunger games search (HGS): (1) entrapment in local optima, (2) premature convergence, and (3) an imbalance between the exploration and exploitation phases. The mHGS was evaluated on the IEEE Congress on Evolutionary Computation 2020 (CEC'20) test suite for optimization and on ten medical and chemical datasets with up to 20,000 features or more. Its results were compared with those of a variety of well-known optimization methods: the improved multi-operator differential evolution algorithm (IMODE), the gravitational search algorithm, grey wolf optimization, Harris hawks optimization, the whale optimization algorithm, the slime mould algorithm, and the original HGS. The experimental results suggest that the proposed mHGS generates effective search results and improves convergence speed without increasing computational cost. It also improves the classification performance of a support vector machine (SVM).
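The abstract does not state how candidate solutions are encoded or scored, so the following is only a minimal sketch of a wrapper-style FS objective as it is commonly formulated for metaheuristics, not the authors' method. The threshold of 0.5, the weight `alpha = 0.99`, and the helper names `to_mask`, `fitness`, and `toy_error` are all assumptions made for illustration; `toy_error` is a hypothetical stand-in for the cross-validated error of a classifier such as an SVM.

```python
def to_mask(position, threshold=0.5):
    """Map a continuous search position in [0, 1]^d to a binary feature mask."""
    mask = [1 if x > threshold else 0 for x in position]
    if not any(mask):
        # Assumption: never return an empty subset; keep the strongest feature.
        best = max(range(len(position)), key=position.__getitem__)
        mask[best] = 1
    return mask


def fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of classification error and fraction of features kept.

    Lower is better. `error_rate` is any callable that scores a mask,
    for example the cross-validated error of a classifier trained on
    the selected columns only.
    """
    kept_ratio = sum(mask) / len(mask)
    return alpha * error_rate(mask) + (1 - alpha) * kept_ratio


def toy_error(mask, informative=(0, 2)):
    """Hypothetical error: share of the informative features left out."""
    missing = sum(1 for i in informative if not mask[i])
    return missing / len(informative)


if __name__ == "__main__":
    position = [0.9, 0.2, 0.7, 0.1]      # one agent's continuous position
    mask = to_mask(position)             # selects features 0 and 2
    print(mask, fitness(mask, toy_error))
```

Under this formulation, a subset that keeps only the informative features scores lower (better) than the full feature set, which is the trade-off between accuracy and dimensionality that the abstract describes.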