Kernel component analysis using an epsilon-insensitive robust loss function.

Authors

Alzate Carlos, Suykens Johan A K

Affiliation

Department of Electrical Engineering ESAT-SCD-SISTA, Katholieke Universiteit Leuven, B-3001 Leuven, Belgium.

Publication

IEEE Trans Neural Netw. 2008 Sep;19(9):1583-98. doi: 10.1109/TNN.2008.2000443.

Abstract

Kernel principal component analysis (PCA) is a technique to perform feature extraction in a high-dimensional feature space, which is nonlinearly related to the original input space. The kernel PCA formulation corresponds to an eigendecomposition of the kernel matrix: eigenvectors with large eigenvalues correspond to the principal components in the feature space. Starting from the least squares support vector machine (LS-SVM) formulation to kernel PCA, we extend it to a generalized form of kernel component analysis (KCA) with a general underlying loss function made explicit. For classical kernel PCA, the underlying loss function is L2. In this generalized form, one can also plug in other loss functions. In the context of robust statistics, it is known that the L2 loss function is not robust because its influence function is not bounded. Therefore, outliers can skew the solution away from the desired one. Another issue with kernel PCA is the lack of sparseness: the principal components are dense expansions in terms of kernel functions. In this paper, we introduce robustness and sparseness into kernel component analysis by using an epsilon-insensitive robust loss function. We propose two different algorithms. The first method solves a set of nonlinear equations with kernel PCA as starting points. The second method uses a simplified iterative weighting procedure that leads to solving a sequence of generalized eigenvalue problems. Simulations with toy and real-life data show improvements in terms of robustness together with a sparse representation.
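
The two ideas the abstract builds on can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' algorithms: the RBF kernel, the toy points, the power-iteration solver, and all function names are hypothetical choices made here. The epsilon-insensitive loss shown is the standard linear (Vapnik-style) variant, used only as a stand-in to show a bounded influence function; the exact loss defined in the paper may differ.

```python
import math


def rbf_kernel_matrix(points, sigma=1.0):
    """Gram matrix with K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [[math.exp(-sq_dist(p, q) / (2.0 * sigma ** 2)) for q in points]
            for p in points]


def center_kernel_matrix(K):
    """Center the data in feature space; K is assumed symmetric."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def top_eigenpair(M, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix
    by power iteration (assumes a gap between the top two eigenvalues)."""
    n = len(M)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, Mv))  # Rayleigh quotient
    return eigenvalue, v


def squared_loss_influence(e):
    """Derivative of the L2 loss e^2: grows without bound in |e|."""
    return 2.0 * e


def eps_insensitive_loss(e, eps):
    """Zero inside the band |e| <= eps, linear outside it."""
    return max(0.0, abs(e) - eps)


def eps_insensitive_influence(e, eps):
    """Derivative of the loss above: 0 inside the band, +/-1 outside."""
    if abs(e) <= eps:
        return 0.0
    return 1.0 if e > 0 else -1.0


# Toy data (hypothetical): two small clusters in the plane.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (2.0, 2.0), (2.1, 1.9), (1.9, 2.2)]
Kc = center_kernel_matrix(rbf_kernel_matrix(points, sigma=1.0))
eigenvalue, alpha = top_eigenpair(Kc)
print("largest eigenvalue:", eigenvalue)
print("first component coefficients:", alpha)

for e in (0.1, 1.0, 100.0):
    print(e, squared_loss_influence(e), eps_insensitive_influence(e, eps=0.5))
```

In this sketch the coefficient vector `alpha` is dense, with one nonzero entry per training point, which is the lack of sparseness the abstract mentions. The influence of the squared loss grows linearly with the residual, while the epsilon-insensitive stand-in stays within [-1, 1] and is exactly zero inside the band.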

