Hsieh Pi-Fuei, Wang Deng-Shiang, Hsu Chia-Wei
Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan.
IEEE Trans Pattern Anal Mach Intell. 2006 Feb;28(2):223-35. doi: 10.1109/TPAMI.2006.26.
A parametric linear feature extraction method is proposed for multiclass classification. The method combines two schemes that are complementary in the discriminant information they use: the approximate pairwise accuracy criterion (aPAC), which exploits differences in class means, and common-mean feature extraction (CMFE), which exploits differences in class covariances. Choosing aPAC over linear discriminant analysis (LDA) also resolves LDA's overemphasis on large between-class distances while retaining LDA's other desirable properties. Directly cascading the two schemes is suboptimal, so a mechanism is needed for sorting and merging features according to their effectiveness. Evaluating feature effectiveness with a sample-based classification error estimate is usually computationally expensive. We therefore develop a fast spanning-tree-based parametric classification accuracy estimator that serves as the intermediary for combining aPAC and CMFE. Because the entire framework is parametric, it avoids the high computational cost typical of sample-based approaches. Experiments show that the proposed method performs satisfactorily on both real and simulated data.
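The mean-based stage mentioned in the abstract can be illustrated with a minimal sketch. It assumes the pairwise weighting usually given for aPAC in the literature, w(d) = erf(d / (2*sqrt(2))) / (2*d^2), where d is the Mahalanobis distance between two class means; the abstract itself does not state this formula. The function name `apac_transform` and its parameters are hypothetical, and the CMFE stage and the spanning-tree accuracy estimator are not shown.

```python
import numpy as np
from math import erf, sqrt


def apac_transform(means, priors, S_w, d):
    """Sketch of an aPAC-style linear map (d x n); not the authors' code.

    means  : (K, n) class means
    priors : (K,) class prior probabilities
    S_w    : (n, n) pooled within-class covariance (assumed positive definite)
    d      : number of features to keep
    """
    means = np.asarray(means, dtype=float)
    priors = np.asarray(priors, dtype=float)
    S_w = np.asarray(S_w, dtype=float)

    # Whitening matrix S_w^{-1/2}, so Euclidean distance between whitened
    # means equals the Mahalanobis distance between the original means.
    evals, evecs = np.linalg.eigh(S_w)
    whiten = evecs @ np.diag(evals ** -0.5) @ evecs.T

    K, n = means.shape
    S_b = np.zeros((n, n))
    for i in range(K):
        for j in range(i + 1, K):
            diff = whiten @ (means[i] - means[j])
            delta = np.linalg.norm(diff)
            if delta == 0.0:
                continue  # coincident means carry no mean-based information
            # Assumed aPAC weight: down-weights pairs that are already far apart.
            w = erf(delta / (2.0 * sqrt(2.0))) / (2.0 * delta ** 2)
            S_b += priors[i] * priors[j] * w * np.outer(diff, diff)

    # Leading eigenvectors of the weighted between-class scatter.
    vals, vecs = np.linalg.eigh(S_b)
    top = vecs[:, np.argsort(vals)[::-1][:d]]
    return top.T @ whiten
```

With only two classes the weighting has no effect on the direction, and the returned row is parallel to the Fisher direction S_w^{-1}(m_1 - m_2). With more classes, each pair's contribution to the scatter is bounded, so a single distant class cannot dominate the extracted features the way it can in unweighted LDA.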