Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Ibaraki 305-8573, Japan
Institute of Statistical Mathematics, Tachikawa, Tokyo 190-8562, Japan, and Center for Advanced Intelligence Project, RIKEN, Chuo-ku, Tokyo 103-0027, Japan
Neural Comput. 2020 Oct;32(10):1901-1935. doi: 10.1162/neco_a_01308. Epub 2020 Aug 14.
Principal component analysis (PCA) is a widely used method for data processing, for example for dimension reduction and visualization. Standard PCA is known to be sensitive to outliers, and various robust PCA methods have been proposed. It has been shown that the robustness of many statistical methods can be improved by using mode estimation instead of mean estimation, because mode estimation is not significantly affected by the presence of outliers. This study therefore proposes modal principal component analysis (MPCA), a robust PCA method based on mode estimation. The proposed method finds the minor component by estimating the mode of the projected data points. As theoretical contributions, the probabilistic convergence property, the influence function, and the finite-sample breakdown point of the proposed MPCA, along with a lower bound on that breakdown point, are derived. The experimental results show that the proposed method has advantages over conventional methods.
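The idea of finding a minor component through the mode of projected points can be illustrated with a small sketch. The code below is not the algorithm of the paper, whose details are not given in the abstract. It is a hypothetical brute-force illustration in two dimensions: it scans unit directions and picks the one whose projected points have the highest Gaussian kernel density peak, then compares that with the smallest-eigenvalue direction of the sample covariance. The function names, the bandwidth, the angle grid, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def peak_density(z, bandwidth):
    """Highest value of an unnormalized Gaussian KDE of 1-D samples z,
    approximated by evaluating the density at the sample points only."""
    diffs = (z[:, None] - z[None, :]) / bandwidth
    return np.exp(-0.5 * diffs**2).sum(axis=1).max()


def mode_based_minor_direction(X, bandwidth=0.3, n_angles=180):
    """Hypothetical illustration (2-D only): scan unit vectors w and return
    the one whose projections X @ w concentrate most strongly around a mode."""
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    directions = np.column_stack([np.cos(angles), np.sin(angles)])
    scores = [peak_density(X @ w, bandwidth) for w in directions]
    return directions[int(np.argmax(scores))]


def pca_minor_direction(X):
    """Standard PCA minor component: eigenvector of the sample covariance
    with the smallest eigenvalue (eigh returns eigenvalues in ascending order)."""
    cov = np.cov(X, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, 0]


# Synthetic data (assumed for illustration): inliers vary widely along x and
# narrowly along y, so the true minor direction is the y axis.
rng = np.random.default_rng(0)
inliers = np.column_stack([rng.normal(0.0, 3.0, 300), rng.normal(0.0, 0.2, 300)])
# Outliers sit far away along y and inflate the variance in that direction.
outliers = np.column_stack([rng.normal(0.0, 1.0, 30), rng.normal(15.0, 1.0, 30)])
X = np.vstack([inliers, outliers])

e_y = np.array([0.0, 1.0])
w_pca = pca_minor_direction(X)
w_mode = mode_based_minor_direction(X)

# Absolute cosine with the true minor direction (1 means perfectly aligned).
print("standard PCA:", abs(w_pca @ e_y))
print("mode-based scan:", abs(w_mode @ e_y))
```

In this setup the outliers make the y direction the high-variance one, so the covariance-based minor component swings toward the x axis, while the mode-based scan still aligns with the y axis because the dense inlier cluster dominates the density peak. A practical method would need a bandwidth selection rule and an optimizer that scales beyond two dimensions, neither of which this sketch attempts.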