Sun Mingming, Yang Jian, Liu Chuancai, Yang Jingyu
Department of Computer Science, Nanjing University of Science and Technology, China.
IEEE Trans Neural Netw. 2010 Sep;21(9):1445-56. doi: 10.1109/TNN.2010.2048577. Epub 2010 Jun 21.
This paper discusses which kind of learning model is suitable for feature extraction for data representation, and suggests two evaluation criteria for nonlinear feature extractors: reconstruction error minimization and similarity preservation. Based on these criteria, a new type of principal curve, the similarity preserving principal curve (SPPC), is proposed. SPPCs minimize the reconstruction error under the condition that the similarity between similar samples is preserved in the extracted features, thus giving researchers an effective and reliable understanding of the inner structure of data sets. The existence and properties of SPPCs are analyzed, a practical learning algorithm is proposed, and high-dimensional extensions of SPPCs are discussed. Experimental results show the advantages of SPPCs in preserving the inner structure of data sets and in discovering highly nonlinear manifolds.
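The two evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the curve is approximated by a polyline, the extracted feature is the arc-length position of each sample's projection, and similarity preservation is measured by a k-nearest-neighbor overlap score. The function names (`project_onto_polyline`, `reconstruction_error`, `neighbor_preservation`) are hypothetical, and this is not the SPPC learning algorithm from the paper.

```python
import numpy as np

def project_onto_polyline(X, V):
    """Project each row of X onto the polyline with vertices V.

    Returns the arc-length feature t (n,) and the projected points (n, d).
    Assumes consecutive vertices are distinct (no zero-length segments).
    """
    seg_start = V[:-1]                                   # (m, d)
    seg_vec = V[1:] - V[:-1]                             # (m, d)
    seg_len = np.linalg.norm(seg_vec, axis=1)            # (m,)
    # Arc length at the start of each segment.
    cum_len = np.concatenate([[0.0], np.cumsum(seg_len)[:-1]])
    diff = X[:, None, :] - seg_start[None, :, :]         # (n, m, d)
    # Fractional position along each segment, clipped to the segment.
    frac = np.einsum("nmd,md->nm", diff, seg_vec) / seg_len**2
    frac = np.clip(frac, 0.0, 1.0)
    proj = seg_start[None] + frac[..., None] * seg_vec[None]
    dist2 = np.sum((X[:, None, :] - proj) ** 2, axis=2)  # (n, m)
    best = np.argmin(dist2, axis=1)                      # closest segment
    rows = np.arange(len(X))
    t = cum_len[best] + frac[rows, best] * seg_len[best]
    return t, proj[rows, best]

def reconstruction_error(X, recon):
    """Mean squared distance between samples and their projections."""
    return float(np.mean(np.sum((X - recon) ** 2, axis=1)))

def _knn(D, k):
    """Indices of the k nearest neighbors per row, excluding self."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)
    return np.argsort(D, axis=1)[:, :k]

def neighbor_preservation(X, t, k=5):
    """Assumed similarity score: fraction of each sample's k nearest
    neighbors in input space that are also among its k nearest
    neighbors in the 1-D feature t (1.0 means fully preserved)."""
    DX = np.linalg.norm(X[:, None] - X[None], axis=2)
    Dt = np.abs(t[:, None] - t[None])
    nx, nt = _knn(DX, k), _knn(Dt, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nx, nt)]
    return float(np.mean(overlap)) / k

# Usage: noisy samples around an L-shaped polyline (synthetic data).
rng = np.random.default_rng(0)
V = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
X = np.vstack([
    np.column_stack([rng.uniform(0, 1, 50), rng.normal(0, 0.02, 50)]),
    np.column_stack([1 + rng.normal(0, 0.02, 50), rng.uniform(0, 1, 50)]),
])
t, recon = project_onto_polyline(X, V)
print(reconstruction_error(X, recon), neighbor_preservation(X, t, k=5))
```

A curve can score well on one criterion and poorly on the other: a curve that winds tightly through the samples lowers the reconstruction error, but it may assign distant feature values to nearby samples. That trade-off is what the abstract describes SPPCs as constraining.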