PCA in High Dimensions: An orientation.

Authors

Iain M. Johnstone, Debashis Paul

Affiliations

Department of Statistics, Stanford University, Stanford CA 94305.

Department of Statistics, University of California, Davis.

Publication information

Proc IEEE Inst Electr Electron Eng. 2018 Aug;106(8):1277-1292. doi: 10.1109/JPROC.2018.2846730. Epub 2018 Jul 18.


DOI:10.1109/JPROC.2018.2846730
PMID:30287970
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6167023/
Abstract

When the data are high dimensional, widely used multivariate statistical methods such as principal component analysis can behave in unexpected ways. In settings where the dimension of the observations is comparable to the sample size, upward bias in sample eigenvalues and inconsistency of sample eigenvectors are among the most notable phenomena that appear. These phenomena, and the limiting behavior of the rescaled extreme sample eigenvalues, have recently been investigated in detail under the spiked covariance model. The behavior of the bulk of the sample eigenvalues under weak distributional assumptions on the observations has been described. These results have been exploited to develop new estimation and hypothesis testing methods for the population covariance matrix. Furthermore, partly in response to these phenomena, alternative classes of estimation procedures have been developed by exploiting sparsity of the eigenvectors or the covariance matrix. This paper gives an orientation to these areas.
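The two phenomena named in the abstract, upward bias of the leading sample eigenvalue and inconsistency of the leading sample eigenvector, can be seen in a small simulation. The following is a minimal sketch, not the authors' code. It assumes Gaussian observations with known zero mean, identity noise covariance, and a single spike along the first coordinate. The function name `spiked_pca_demo` and its parameters are illustrative. The limits it reports are the standard asymptotic values for a spike ℓ above the threshold 1 + √γ, with γ = p/n.

```python
import numpy as np


def spiked_pca_demo(n=2000, p=1000, spike=4.0, seed=0):
    """Draw one sample from a rank-one spiked covariance model and compare
    the leading sample eigenpair with its population counterpart."""
    rng = np.random.default_rng(seed)

    # Population covariance: identity, except variance `spike` on coordinate 0,
    # so the true leading eigenvector is the first basis vector e_1.
    scales = np.ones(p)
    scales[0] = np.sqrt(spike)
    X = rng.standard_normal((n, p)) * scales  # rows are observations

    # Sample covariance (the mean is assumed known and equal to zero).
    S = X.T @ X / n
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    top_val = eigvals[-1]
    top_vec = eigvecs[:, -1]

    # Squared cosine of the angle between sample and true leading eigenvector.
    cos2 = top_vec[0] ** 2

    # Asymptotic limits, valid when spike > 1 + sqrt(gamma).
    gamma = p / n
    limit_val = spike * (1 + gamma / (spike - 1))
    limit_cos2 = (1 - gamma / (spike - 1) ** 2) / (1 + gamma / (spike - 1))

    return {
        "population_top_eigenvalue": spike,
        "sample_top_eigenvalue": top_val,
        "limit_top_eigenvalue": limit_val,
        "sample_cos2": cos2,
        "limit_cos2": limit_cos2,
    }


if __name__ == "__main__":
    for name, value in spiked_pca_demo().items():
        print(f"{name}: {value:.4f}")
```

With p/n held at 0.5, the leading sample eigenvalue should sit well above the population value 4, and the squared cosine should stay bounded away from 1 however large n becomes. That persistent gap is the inconsistency the abstract refers to.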

Similar articles

[1]
PCA in High Dimensions: An orientation.

Proc IEEE Inst Electr Electron Eng. 2018-8

[2]
Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory.

Genetics. 2017-7

[3]
Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model.

Ann Stat. 2018-8

[4]
Considering Horn's Parallel Analysis from a Random Matrix Theory Point of View.

Psychometrika. 2017-3

[5]
MINIMAX BOUNDS FOR SPARSE PCA WITH NOISY HIGH-DIMENSIONAL DATA.

Ann Stat. 2013-6

[6]
Asymptotics of empirical eigenstructure for high dimensional spiked covariance.

Ann Stat. 2017-6

[7]
CONVERGENCE AND PREDICTION OF PRINCIPAL COMPONENT SCORES IN HIGH-DIMENSIONAL SETTINGS.

Ann Stat. 2010-1-1

[8]
Asymptotic properties of principal component analysis and shrinkage-bias adjustment under the generalized spiked population model.

J Multivar Anal. 2019-9

[9]
EDGEWORTH CORRECTION FOR THE LARGEST EIGENVALUE IN A SPIKED PCA MODEL.

Stat Sin. 2018-10

[10]
Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models.

Stat Sin. 2021-4

Cited by

[1]
Testing for differences in polygenic scores in the presence of confounding.

Genetics. 2025-6-4

[2]
Multivariate classification of livestock production systems in Mexico.

Trop Anim Health Prod. 2025-3-22

[3]
Semisynthetic simulation for microbiome data analysis.

Brief Bioinform. 2024-11-22

[4]
The Dyson equalizer: adaptive noise stabilization for low-rank signal detection and recovery.

Inf Inference. 2025-1-16

[5]
Improved liver fat and quantification at 0.55 T using locally low-rank denoising.

Magn Reson Med. 2025-3

[6]
JASPER: Fast, powerful, multitrait association testing in structured samples gives insight on pleiotropy in gene expression.

Am J Hum Genet. 2024-8-8

[7]
Explaining deep learning-based representations of resting state functional connectivity data: focusing on interpreting nonlinear patterns in autism spectrum disorder.

Front Psychiatry. 2024-5-20

[8]
Biwhitening Reveals the Rank of a Count Matrix.

SIAM J Math Data Sci. 2022

[9]
Testing for differences in polygenic scores in the presence of confounding.

bioRxiv. 2024-6-26

[10]
Structural Analysis and Classification of Low-Molecular-Weight Hyaluronic Acid by Near-Infrared Spectroscopy: A Comparison between Traditional Machine Learning and Deep Learning.

Molecules. 2023-1-13
