PCA in High Dimensions: An orientation.

Author information

Johnstone Iain M, Paul Debashis

Affiliations

Department of Statistics, Stanford University, Stanford CA 94305.

Department of Statistics, University of California, Davis.

Publication information

Proc IEEE Inst Electr Electron Eng. 2018 Aug;106(8):1277-1292. doi: 10.1109/JPROC.2018.2846730. Epub 2018 Jul 18.

Abstract

When the data are high dimensional, widely used multivariate statistical methods such as principal component analysis can behave in unexpected ways. In settings where the dimension of the observations is comparable to the sample size, upward bias in sample eigenvalues and inconsistency of sample eigenvectors are among the most notable phenomena that appear. These phenomena, and the limiting behavior of the rescaled extreme sample eigenvalues, have recently been investigated in detail under the spiked covariance model. The behavior of the bulk of the sample eigenvalues under weak distributional assumptions on the observations has been described. These results have been exploited to develop new estimation and hypothesis testing methods for the population covariance matrix. Furthermore, partly in response to these phenomena, alternative classes of estimation procedures have been developed by exploiting sparsity of the eigenvectors or the covariance matrix. This paper gives an orientation to these areas.
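The two phenomena named in the abstract, upward bias in sample eigenvalues and inconsistency of sample eigenvectors, can be seen in a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes Gaussian data with known zero mean, a single spike along the first coordinate, and an arbitrary choice of n, p, and spike size. The function name `simulate_spiked_pca` and its parameters are hypothetical. The "predicted" values use the standard large-n, large-p limits for a single spike above the detection threshold 1 + sqrt(p/n), which the abstract itself does not state.

```python
import numpy as np


def simulate_spiked_pca(n=2000, p=1000, spike=4.0, seed=0):
    """Simulate PCA under a one-spike covariance model (illustrative assumptions).

    Population covariance: identity, except variance `spike` along the first
    coordinate, so the leading population eigenvector is e_1.
    """
    rng = np.random.default_rng(seed)

    # n observations in p dimensions; scale the first coordinate to plant the spike.
    X = rng.standard_normal((n, p))
    X[:, 0] *= np.sqrt(spike)

    # Sample covariance, treating the mean as known to be zero.
    S = X.T @ X / n

    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    evals, evecs = np.linalg.eigh(S)
    top_eigenvalue = evals[-1]
    top_vector = evecs[:, -1]

    # Squared cosine of the angle between the sample and population eigenvectors.
    cos2 = top_vector[0] ** 2

    # Standard asymptotic limits, valid when spike > 1 + sqrt(gamma).
    gamma = p / n
    predicted_eigenvalue = spike * (1.0 + gamma / (spike - 1.0))
    predicted_cos2 = (1.0 - gamma / (spike - 1.0) ** 2) / (1.0 + gamma / (spike - 1.0))

    return {
        "spike": spike,
        "gamma": gamma,
        "top_eigenvalue": top_eigenvalue,
        "predicted_eigenvalue": predicted_eigenvalue,
        "cos2": cos2,
        "predicted_cos2": predicted_cos2,
    }
```

Calling `simulate_spiked_pca()` and comparing `top_eigenvalue` with `spike` should show the sample eigenvalue sitting above the population value, while `cos2` stays bounded away from 1 even though n is large. This is the sense in which the sample eigenvector is inconsistent when p is comparable to n.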


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/735b/6167023/ced5156719ef/nihms-1503591-f0003.jpg

Similar articles

Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model.
Ann Stat. 2018 Aug;46(4):1742-1778. doi: 10.1214/17-AOS1601. Epub 2018 Jun 27.

Cited by

Multivariate classification of livestock production systems in Mexico.
Trop Anim Health Prod. 2025 Mar 22;57(3):140. doi: 10.1007/s11250-025-04389-5.

Biwhitening Reveals the Rank of a Count Matrix.
SIAM J Math Data Sci. 2022;4(4):1420-1446. doi: 10.1137/21m1456807.
