Rémi Gribonval
French National Center for Computer Science and Control (INRIA) at IRISA, 35042 Rennes, France.
IEEE Trans Neural Netw. 2005 May;16(3):522-32. doi: 10.1109/TNN.2005.844900.
While much effort has been devoted to the development of nonlinear approximation theory and its applications to signal and image compression, encoding, and denoising, there seem to have been very few theoretical developments of adaptive discriminant representations in the area of feature extraction, selection, and signal classification. In this paper, we advocate the idea that such developments and efforts are worthwhile, based on the theoretical study of a data-driven discriminant analysis method on a simple, yet instructive, example. We consider the problem of classifying a signal drawn from a mixture of two classes, using its projections onto low-dimensional subspaces. Unlike the linear discriminant analysis (LDA) strategy, which selects subspaces that do not depend on the observed signal, we consider an adaptive sequential selection of projections, in the spirit of nonlinear approximation and classification and regression trees (CART): at each step, the subspace is enlarged in the direction that maximizes the mutual information with the unknown class. We derive explicit characterizations of this adaptive discriminant analysis (ADA) strategy in two situations. When the two classes are Gaussian with the same covariance matrix but different means, the adaptive subspaces are actually nonadaptive and can be computed with an algorithm similar to orthonormal matching pursuit. When the classes are centered Gaussians with different covariances, the adaptive subspaces are spanned by eigenvectors of an operator given by the covariance matrices (just as regular LDA would predict); however, we prove that the order in which the components along these eigenvectors are observed actually depends on the observed signal. Numerical experiments on synthetic data illustrate how data-dependent features can be used to outperform LDA on a classification task, and we discuss how our results could be applied in practice.
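To make the equal-covariance case concrete, the following is a minimal sketch (not the paper's exact algorithm) of the greedy, matching-pursuit-style selection described above: from a hypothetical dictionary of candidate directions, each step picks the direction that, after orthogonalization against the already-selected subspace, maximizes the squared gap between the projected class means normalized by the projected variance. For two Gaussians with equal covariance, this ratio is monotonically related to the mutual information between the projection and the class label, so maximizing it is a proxy for the mutual-information criterion. The function name and dictionary are illustrative assumptions.

```python
import numpy as np

def greedy_directions(mu0, mu1, cov, candidates, k):
    """Greedily select up to k directions from `candidates` (a list of
    vectors) for discriminating two Gaussian classes N(mu0, cov) and
    N(mu1, cov). Each step orthogonalizes the candidates against the
    subspace chosen so far (as in orthonormal matching pursuit) and keeps
    the direction maximizing (projected mean gap)^2 / (projected variance),
    a quantity monotone in the mutual information with the class label
    in this equal-covariance setting."""
    candidates = [np.asarray(c, dtype=float) for c in candidates]
    basis = []                       # orthonormal basis of the chosen subspace
    available = list(range(len(candidates)))
    for _ in range(k):
        best_i, best_dir, best_score = None, None, -np.inf
        for i in available:
            # Residual of the candidate after projecting out the chosen subspace.
            r = candidates[i] - sum(np.dot(candidates[i], b) * b for b in basis)
            norm = np.linalg.norm(r)
            if norm < 1e-10:         # candidate already lies in the subspace
                continue
            r = r / norm
            gap = float(np.dot(mu1 - mu0, r))
            score = gap * gap / float(r @ cov @ r)
            if score > best_score:
                best_i, best_dir, best_score = i, r, score
        if best_i is None:           # no informative direction remains
            break
        basis.append(best_dir)
        available.remove(best_i)
    return basis
```

With identity covariance and means separated along the first axis, the first selected direction aligns with the mean difference, and subsequent directions carry no additional class information, illustrating why the selection is nonadaptive in this case.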