Johnstone Iain M, Paul Debashis
Department of Statistics, Stanford University, Stanford CA 94305.
Department of Statistics, University of California, Davis.
Proc IEEE Inst Electr Electron Eng. 2018 Aug;106(8):1277-1292. doi: 10.1109/JPROC.2018.2846730. Epub 2018 Jul 18.
When the data are high dimensional, widely used multivariate statistical methods such as principal component analysis can behave in unexpected ways. In settings where the dimension of the observations is comparable to the sample size, upward bias in sample eigenvalues and inconsistency of sample eigenvectors are among the most notable phenomena that appear. These phenomena, and the limiting behavior of the rescaled extreme sample eigenvalues, have recently been investigated in detail under the spiked covariance model. The behavior of the bulk of the sample eigenvalues under weak distributional assumptions on the observations has been described. These results have been exploited to develop new estimation and hypothesis testing methods for the population covariance matrix. Furthermore, partly in response to these phenomena, alternative classes of estimation procedures have been developed by exploiting sparsity of the eigenvectors or the covariance matrix. This paper gives an orientation to these areas.
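The two phenomena named in the abstract, upward bias of sample eigenvalues and inconsistency of sample eigenvectors, can be seen in a small simulation. The sketch below is illustrative only and is not taken from the paper. It assumes Gaussian observations and a single-spike covariance (identity plus one eigenvalue `spike` along the first coordinate). The function name `simulate_spiked_pca` and its parameters are hypothetical. The limiting values it compares against are the standard large-sample formulas for a spike above the threshold 1 + sqrt(gamma), with gamma = p/n.

```python
import numpy as np


def simulate_spiked_pca(n=1000, p=500, spike=4.0, seed=0):
    """Compare the top sample eigenpair with its population counterpart.

    Assumed model (illustration only): rows are N(0, Sigma) with
    Sigma = diag(spike, 1, ..., 1), so the leading population
    eigenvector is the first coordinate axis e_1.
    """
    rng = np.random.default_rng(seed)

    # Draw n observations of dimension p, then scale coordinate 0
    # so that its variance equals `spike`.
    X = rng.standard_normal((n, p))
    X[:, 0] *= np.sqrt(spike)

    # Sample covariance (mean known to be zero) and its spectrum.
    S = X.T @ X / n
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    top_val = eigvals[-1]
    top_vec = eigvecs[:, -1]

    # Squared cosine of the angle between the sample and population
    # leading eigenvectors (the population one is e_1).
    cos2 = top_vec[0] ** 2

    # Standard asymptotic limits, valid when spike > 1 + sqrt(gamma).
    gamma = p / n
    limit_val = spike * (1 + gamma / (spike - 1))
    limit_cos2 = (1 - gamma / (spike - 1) ** 2) / (1 + gamma / (spike - 1))

    return top_val, limit_val, cos2, limit_cos2


if __name__ == "__main__":
    top_val, limit_val, cos2, limit_cos2 = simulate_spiked_pca()
    print(f"population spike      : 4.0")
    print(f"top sample eigenvalue : {top_val:.3f} (limit {limit_val:.3f})")
    print(f"squared cosine        : {cos2:.3f} (limit {limit_cos2:.3f})")
```

With p/n = 0.5 and a spike of 4, the top sample eigenvalue should land near 4.67 rather than 4, and the squared cosine near 0.81 rather than 1. Neither gap closes as n grows with p/n held fixed.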