Avoiding Optimal Mean ℓ-Norm Maximization-Based Robust PCA for Reconstruction.

Author information

Luo Minnan, Nie Feiping, Chang Xiaojun, Yang Yi, Hauptmann Alexander G, Zheng Qinghua

Affiliations

SPKLSTN Lab, Department of Computer Science, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China

Center for OPTical Imagery Analysis and Learning, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China

Publication information

Neural Comput. 2017 Apr;29(4):1124-1150. doi: 10.1162/NECO_a_00937. Epub 2017 Jan 17.

Abstract

Robust principal component analysis (PCA) is one of the most important dimension-reduction techniques for handling high-dimensional data with outliers. However, most existing robust PCA methods presuppose that the mean of the data is zero and incorrectly use the average of the data as the optimal mean of robust PCA. In fact, this assumption holds only for traditional PCA based on the squared [Formula: see text]-norm. In this letter, we equivalently reformulate the objective of conventional PCA and learn the optimal projection directions by maximizing the sum of the projected differences between each pair of instances under the [Formula: see text]-norm. The proposed method is robust to outliers and invariant to rotation. More importantly, the reformulated objective not only automatically avoids computing the optimal mean, making the centered-data assumption unnecessary, but also connects theoretically to the minimization of reconstruction error. To solve the resulting nonsmooth problem, we develop an efficient optimization algorithm that softens the contributions of outliers by iteratively reweighting each data point. We theoretically analyze the convergence and computational complexity of the proposed algorithm. Extensive experimental results on several benchmark data sets illustrate the effectiveness and superiority of the proposed method.
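The abstract describes learning projection directions by maximizing the sum of non-squared projected pairwise differences, solved by iterative reweighting. The sketch below illustrates that algorithmic idea only; it is not the authors' implementation. The function name, the choice of pairwise (rather than per-point) weights, and the IRLS-style weight `1/(2*distance)` — which follows from majorizing the non-squared norm — are all assumptions made for illustration.

```python
import numpy as np

def pairwise_robust_pca(X, k, n_iter=20, eps=1e-8):
    """Illustrative sketch: robust PCA without mean estimation.

    Approximately maximizes sum_{i<j} ||W^T (x_i - x_j)||_2 over
    orthonormal W (columns = projection directions).  Using pairwise
    differences removes the need to compute or subtract a data mean.
    """
    n, d = X.shape
    # all pairwise differences x_i - x_j, i < j
    iu, ju = np.triu_indices(n, k=1)
    D = X[iu] - X[ju]                      # shape (n_pairs, d)
    w = np.ones(len(D))                    # per-pair weights (assumption)
    for _ in range(n_iter):
        # weighted scatter matrix of the pairwise differences
        M = (D * w[:, None]).T @ D
        # top-k eigenvectors of M give the current projection W
        _, vecs = np.linalg.eigh(M)
        W = vecs[:, -k:]
        # reweight: pairs with large projected distance (likely involving
        # outliers) receive small weight, softening their contribution
        proj = np.linalg.norm(D @ W, axis=1)
        w = 1.0 / (2.0 * np.maximum(proj, eps))
    return W
```

Each iteration reduces to a standard symmetric eigenproblem, so the per-iteration cost is dominated by forming the weighted scatter matrix over the O(n²) pairs; the paper's own algorithm and its complexity analysis may differ.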

