Shen Dan, Shen Haipeng, Zhu Hongtu, Marron J S
University of South Florida.
University of Hong Kong.
Stat Sin. 2016 Oct;26(4):1747-1770. doi: 10.5705/ss.202015.0088.
The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the ratio between the dimension and the product of the sample size with the spike size. When this ratio converges to a nonzero constant, the sample eigenvector converges to a cone, with a certain angle to its corresponding population eigenvector. In the High Dimension, Low Sample Size case, the angle between the sample eigenvector and its population counterpart converges to a limiting distribution. Several generalizations of the multi-spike covariance models are also explored, and additional theoretical results are presented.
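The dependence on the ratio between the dimension and the product of sample size and spike size can be checked numerically. The sketch below is not taken from the paper: it assumes the simplest single-spike Gaussian model (one eigenvalue equal to `spike`, all others equal to 1, mean known to be zero), and it compares the simulated squared cosine of the angle against 1/(1 + c), with c = d / (n * spike), which is the limit usually quoted for that simplified setting. The function name `mean_squared_cosine` and all parameter values are illustrative choices.

```python
import numpy as np

def mean_squared_cosine(n, d, spike, reps=10, seed=0):
    """Average squared cosine of the angle between the leading sample
    eigenvector and the population eigenvector e_1, under the assumed
    single-spike model Sigma = diag(spike, 1, ..., 1)."""
    rng = np.random.default_rng(seed)
    cos2 = []
    for _ in range(reps):
        # n observations in d dimensions, zero mean, identity covariance
        X = rng.standard_normal((n, d))
        # Inflate the first coordinate so its variance equals `spike`
        X[:, 0] *= np.sqrt(spike)
        # The leading right singular vector of X is the leading
        # eigenvector of the (uncentered) sample covariance X^T X / n
        _, _, vt = np.linalg.svd(X, full_matrices=False)
        # The population eigenvector is e_1, so the cosine is the first entry
        cos2.append(vt[0, 0] ** 2)
    return float(np.mean(cos2))

n, d, spike = 100, 8000, 80.0
c = d / (n * spike)            # ratio discussed in the abstract (here c = 1)
observed = mean_squared_cosine(n, d, spike)
predicted = 1.0 / (1.0 + c)    # assumed limit for the single-spike case
print(f"c = {c:.2f}, simulated cos^2 = {observed:.3f}, 1/(1+c) = {predicted:.3f}")
print(f"implied angle (degrees): {np.degrees(np.arccos(np.sqrt(observed))):.1f}")
```

Raising `spike` (so that c shrinks toward 0) should push the squared cosine toward 1, the consistent case, while lowering it should push the sample eigenvector away from its population counterpart. With finite n and d the simulated value only approximates the assumed limit.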