Aksu Yaman, Miller David J, Kesidis George, Yang Qing X
Electrical Engineering Department, Pennsylvania State University, University Park, PA 16802, USA.
IEEE Trans Neural Netw. 2010 May;21(5):701-17. doi: 10.1109/TNN.2010.2041069. Epub 2010 Feb 25.
Feature selection for classification in high-dimensional spaces can improve generalization, reduce classifier complexity, and identify important, discriminating feature "markers." For support vector machine (SVM) classification, a widely used technique is recursive feature elimination (RFE). We demonstrate that RFE is not consistent with margin maximization, central to the SVM learning approach. We thus propose explicit margin-based feature elimination (MFE) for SVMs and demonstrate both improved margin and improved generalization, compared with RFE. Moreover, for the case of a nonlinear kernel, we show that RFE assumes that the squared weight vector 2-norm is strictly decreasing as features are eliminated. We demonstrate this is not true for the Gaussian kernel and, consequently, RFE may give poor results in this case. MFE for nonlinear kernels gives better margin and generalization. We also present an extension which achieves further margin gains, by optimizing only two degrees of freedom--the hyperplane's intercept and its squared 2-norm--with the weight vector orientation fixed. We finally introduce an extension that allows margin slackness. We compare against several alternatives, including RFE and a linear programming method that embeds feature selection within the classifier design. On high-dimensional gene microarray data sets, University of California at Irvine (UCI) repository data sets, and Alzheimer's disease brain image data, MFE methods give promising results.
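To make the contrast between the two elimination criteria concrete, the following is a minimal sketch for the linear-kernel case only, and not the authors' implementation. It assumes that RFE removes the feature with the smallest squared weight, and that a margin-based step removes the feature whose deletion leaves the largest margin for the existing hyperplane without retraining. The data set, the weight vector `w`, the intercept `b`, and all function names are hypothetical; `w` is hand-picked for illustration rather than obtained by SVM training.

```python
import math

def rfe_choice(w):
    """RFE-style criterion (linear case): index of the smallest squared weight."""
    return min(range(len(w)), key=lambda k: w[k] ** 2)

def margin_without(X, y, w, b, k):
    """Margin of the fixed hyperplane (w, b) once feature k is dropped.

    No retraining is done: the remaining weights and the intercept are kept
    as they are, and the margin is the smallest signed distance of any
    training sample to the reduced hyperplane.
    """
    norm = math.sqrt(sum(wj ** 2 for j, wj in enumerate(w) if j != k))
    return min(
        yi * (sum(wj * xj for j, (wj, xj) in enumerate(zip(w, xi)) if j != k) + b) / norm
        for xi, yi in zip(X, y)
    )

def margin_choice(X, y, w, b):
    """Margin-based criterion: index whose removal leaves the largest margin."""
    return max(range(len(w)), key=lambda k: margin_without(X, y, w, b, k))

# Hypothetical toy data: two features, two samples per class.
X = [(0.2, 2.0), (2.0, 1.0), (-0.2, -2.0), (-2.0, -1.0)]
y = [1, 1, -1, -1]

# Hand-picked hyperplane (an assumption, not a trained SVM solution).
w = (1.0, 0.6)
b = 0.0

print("RFE-style choice:", rfe_choice(w))               # index 1 (smallest w_k^2)
print("Margin-based choice:", margin_choice(X, y, w, b))  # index 0
```

In this toy case the two criteria disagree: the smallest-weight rule discards feature 1, which leaves a margin of 0.2, whereas discarding feature 0 leaves a margin of 1.0. This illustrates only the claim that weight magnitude and resulting margin need not agree; the nonlinear-kernel case, the two-degree-of-freedom extension, and the slack extension described in the abstract are not covered by the sketch.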