IEEE Trans Cybern. 2016 Jan;46(1):206-18. doi: 10.1109/TCYB.2015.2399456. Epub 2015 Feb 26.
This paper introduces a graph-based semi-supervised embedding method, together with its kernelized version, for generic classification and recognition tasks. The aim is to combine the merits of flexible manifold embedding and nonlinear graph-based embedding for semi-supervised learning. The proposed linear method is flexible because it estimates the nonlinear manifold that is closest to a linear embedding. The proposed kernelized method is likewise flexible because it estimates the kernel-based embedding that is closest to a nonlinear manifold. In both methods, the nonlinear manifold and the mapping (the linear transform for the linear method, the kernel multipliers for the kernelized method) are estimated simultaneously, which overcomes the shortcomings of cascaded estimation. The dimension of the final embedding obtained by either method is not limited to the number of classes. Once the data are embedded into the new subspace, any kind of classifier can be applied. Unlike nonlinear dimensionality reduction approaches, which suffer from the out-of-sample problem, the proposed methods yield a learnt subspace with a direct out-of-sample extension to novel samples, and thus generalize easily to the entire high-dimensional input space. We report extensive experiments on seven public databases to study the performance of the proposed methods. These experiments demonstrate a considerable improvement over state-of-the-art algorithms based on label propagation or graph-based semi-supervised embedding.
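The out-of-sample extension described in the abstract can be illustrated with a minimal sketch. The abstract does not give the objective function or the solver, so nothing below reproduces the paper's estimation step. The matrices `W` (linear transform) and `A` (kernel multipliers) are random stand-ins for learned quantities. The function names, the Gaussian kernel, the absence of a bias term, and the nearest-neighbor classifier are all assumptions chosen for illustration. The sketch only shows how, once such a mapping exists, a novel sample is embedded directly and passed to an arbitrary classifier, with an embedding dimension that is not tied to the number of classes.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """Gaussian kernel matrix between rows of X1 (n1 x d) and X2 (n2 x d)."""
    sq = (X1**2).sum(1)[:, None] + (X2**2).sum(1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def embed_linear(W, X_new):
    """Linear out-of-sample extension: z = W^T x for each row x of X_new."""
    return X_new @ W

def embed_kernel(A, X_train, X_new, gamma=1.0):
    """Kernel out-of-sample extension: z = A^T k(X_train, x) for each row x."""
    return rbf_kernel(X_new, X_train, gamma) @ A

def nearest_neighbor_predict(Z_labeled, y_labeled, Z_new):
    """1-NN in the embedded space; any other classifier could be used instead."""
    d = ((Z_new[:, None, :] - Z_labeled[None, :, :]) ** 2).sum(-1)
    return y_labeled[np.argmin(d, axis=1)]

# Toy data: 20 training samples in 8 dimensions, 2 classes, 4 novel samples.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 8))
y_train = np.repeat([0, 1], 10)
X_new = rng.normal(size=(4, 8))

# Embedding dimension (5) is deliberately larger than the number of classes (2).
dim = 5
W = rng.normal(size=(8, dim))    # placeholder for a learned linear transform
A = rng.normal(size=(20, dim))   # placeholder for learned kernel multipliers

Z_train = embed_linear(W, X_train)
Z_lin = embed_linear(W, X_new)                       # linear embedding of novel samples
Z_ker = embed_kernel(A, X_train, X_new, gamma=0.1)   # kernelized embedding
pred = nearest_neighbor_predict(Z_train, y_train, Z_lin)
```

Because the mapping is an explicit function of the input, embedding a novel sample needs no re-solving of the graph problem, which is the practical difference from transductive nonlinear embeddings.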