Department of Computer Science, Faner Hall, Mailcode 4511, Southern Illinois University Carbondale, 1000 Faner Drive, Carbondale, IL 62901, USA.
IEEE Trans Pattern Anal Mach Intell. 2011 Jun;33(6):1217-33. doi: 10.1109/TPAMI.2010.195.
Selecting features for multiclass classification is a critically important task for pattern recognition and machine learning applications. Especially challenging is selecting an optimal subset of features from high-dimensional data, which typically have many more variables than observations and contain significant noise, missing components, or outliers. Existing methods either cannot handle high-dimensional data efficiently or scalably, or can obtain only a local optimum rather than the global optimum. Toward efficiently selecting the globally optimal subset of features, we introduce a new selector, which we call the Fisher-Markov selector, to identify those features that are the most useful in describing essential differences among the possible groups. In particular, in this paper we present a way to represent essential discriminating characteristics together with sparsity as an optimization objective. With properly identified measures for sparseness and discriminativeness in possibly high-dimensional settings, we take a systematic approach to optimizing these measures so as to choose the best feature subset. We use Markov random field optimization techniques to solve the formulated objective functions for simultaneous feature selection. Our results are noncombinatorial, and they achieve the exact global optimum of the objective function for some special kernels. The method is fast; in particular, it can be linear in the number of features and quadratic in the number of observations. We apply our procedure to a variety of real-world data, including a mid-dimensional optical handwritten digit data set and high-dimensional microarray gene expression data sets. Experimental results confirm the effectiveness of our method. From a pattern recognition and model selection viewpoint, our procedure shows that it is possible to select the most discriminating subset of variables by solving a very simple unconstrained objective function, which in fact admits an explicit expression.
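To make the flavor of such a decoupled, explicit selection rule concrete, the sketch below scores each feature by a Fisher-like ratio of between-class to within-class scatter and keeps a feature whenever its score exceeds a sparsity parameter gamma. This is only an illustration of the kind of per-feature thresholding objective the abstract alludes to for simple (e.g., linear) kernels; the function names, the scoring formula, and the choice of gamma are assumptions for this example, not the authors' actual Fisher-Markov implementation.

```python
# Hypothetical sketch: per-feature Fisher-like scoring with a sparsity threshold.
# Not the authors' Fisher-Markov selector; an illustrative stand-in only.
import numpy as np

def fisher_like_scores(X, y):
    """Per-feature ratio of between-class to within-class scatter.

    X : (n_samples, n_features) data matrix
    y : (n_samples,) integer class labels
    """
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        diff = Xc.mean(axis=0) - overall_mean
        between += Xc.shape[0] * diff ** 2          # between-class scatter per feature
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)  # within-class scatter
    return between / (within + 1e-12)

def select_features(X, y, gamma=0.5):
    """Keep feature j iff its score exceeds the sparsity parameter gamma (assumed)."""
    scores = fisher_like_scores(X, y)
    return np.flatnonzero(scores > gamma)

if __name__ == "__main__":
    # Toy data: 100 features, 60 observations, 3 classes, first 5 features informative.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 100))
    y = np.repeat([0, 1, 2], 20)
    X[:, :5] += y[:, None] * 2.0
    print(select_features(X, y, gamma=0.5))  # typically recovers indices 0..4
```

Because the score and threshold are computed independently per feature, the cost of this toy rule grows linearly with the number of features, which is consistent in spirit with the complexity claim in the abstract, though the actual method's quadratic dependence on the number of observations arises from its kernel formulation rather than from this simplified scoring.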