College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China.
College of Economics and Management, Nanjing Agricultural University, Nanjing 210095, China.
Neural Netw. 2024 Jan;169:597-606. doi: 10.1016/j.neunet.2023.10.037. Epub 2023 Nov 2.
We investigate the limitations of recursive feature elimination (RFE) and its variants in high-dimensional feature selection tasks. We identify two main problems with these methods. First, the feature ranking criterion they use is inconsistent with maximum-margin theory. Second, the criterion is computed locally, so it cannot measure feature importance globally. To address both problems, we propose a new feature ranking criterion, the Maximum Margin and Global (MMG) criterion, which uses the classification margin to score features and computes that score globally, giving a more accurate assessment of feature importance. We also introduce an optimal feature subset evaluation algorithm that uses the MMG criterion to determine the best subset of features. To make the proposed algorithms more efficient, we provide two alpha seeding strategies that significantly reduce computational cost while maintaining high accuracy. Extensive experiments on ten benchmark datasets show that the proposed algorithms outperform current state-of-the-art methods and that the alpha seeding strategies yield significant speedups.
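For orientation, the baseline that the abstract critiques can be sketched in a few lines. The following is a minimal illustration of classic SVM-RFE with a linear SVM, in which the feature with the smallest squared weight w_j^2 is removed at each round. It is not the MMG criterion and not the authors' implementation. The function names (train_linear_svm, svm_rfe, make_toy_data), the Pegasos-style solver, the absence of a bias term, and the toy data are all assumptions made for this sketch.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=100, seed=0):
    """Pegasos-style stochastic subgradient descent for a linear SVM.

    Assumption: no bias term, so the classes should be roughly centred.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        order = list(range(n))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regulariser shrinkage
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def svm_rfe(X, y, n_keep=1):
    """Baseline SVM-RFE: repeatedly drop the feature with the smallest w_j^2."""
    active = list(range(len(X[0])))  # original feature indices still in play
    eliminated = []                  # removal order, least important first
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_sub, y)  # retrain on the surviving features
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        eliminated.append(active.pop(worst))
    return active, eliminated


def make_toy_data(n=100, n_noise=4, seed=1):
    """Hypothetical data: feature 0 carries the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [3.0 * label + rng.gauss(0.0, 0.5)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected, eliminated = svm_rfe(X, y, n_keep=1)
print("kept:", selected, "eliminated (in order):", eliminated)
```

Each round retrains the model from scratch on the surviving features, which is where most of the cost of an RFE-style loop goes. The abstract does not specify the MMG criterion or the two alpha seeding strategies, so neither is reproduced here.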