Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model.

Authors

Donoho David L, Gavish Matan, Johnstone Iain M

Affiliations

Department of Statistics, Stanford University.

School of Computer Science and Engineering, Hebrew University of Jerusalem.

Publication information

Ann Stat. 2018 Aug;46(4):1742-1778. doi: 10.1214/17-AOS1601. Epub 2018 Jun 27.

DOI:10.1214/17-AOS1601
PMID:30258255
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6152949/
Abstract

We show that in a common high-dimensional covariance model, the choice of loss function has a profound effect on optimal estimation. In an asymptotic framework based on the Spiked Covariance model and use of orthogonally invariant estimators, we show that optimal estimation of the population covariance matrix boils down to design of an optimal shrinker η that acts elementwise on the sample eigenvalues. Indeed, to each loss function there corresponds a unique admissible eigenvalue shrinker η* dominating all other shrinkers. The shape of the optimal shrinker is determined by the choice of loss function and, crucially, by inconsistency of both eigenvalues and eigenvectors of the sample covariance matrix. Details of these phenomena and closed form formulas for the optimal eigenvalue shrinkers are worked out for a menagerie of 26 loss functions for covariance estimation found in the literature, including the Stein, Entropy, Divergence, Fréchet, Bhattacharya/Matusita, Frobenius Norm, Operator Norm, Nuclear Norm and Condition Number losses.
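To make the idea of a shrinker acting elementwise on the sample eigenvalues concrete, here is a minimal Python sketch for one loss only, the Frobenius norm. The abstract itself gives no formulas, so everything below is an assumption: the closed forms are the standard spiked-model asymptotics as we recall them (noise level 1, zero-mean data, aspect ratio γ = p/n), and the function names `debias_eigenvalue`, `cosine_squared`, `shrink_frobenius` and `shrink_covariance` are hypothetical. Treat it as an illustration, not the authors' implementation.

```python
import numpy as np


def debias_eigenvalue(lam, gamma):
    """Map a sample eigenvalue back to a population spike.

    Assumes the asymptotic relation lam = ell * (1 + gamma / (ell - 1)),
    valid for ell > 1 + sqrt(gamma). Eigenvalues at or below the bulk
    edge (1 + sqrt(gamma))**2 are mapped to the noise level 1.
    """
    lam = np.asarray(lam, dtype=float)
    bulk_edge = (1.0 + np.sqrt(gamma)) ** 2
    t = lam + 1.0 - gamma
    disc = np.maximum(t ** 2 - 4.0 * lam, 0.0)  # clipped: negative inside the bulk
    ell = 0.5 * (t + np.sqrt(disc))
    return np.where(lam > bulk_edge, ell, 1.0)


def cosine_squared(ell, gamma):
    """Asymptotic squared cosine between sample and population eigenvectors."""
    ell = np.asarray(ell, dtype=float)
    above = ell > 1.0 + np.sqrt(gamma)
    # Placeholder value below the threshold, only to avoid division by zero.
    safe = np.where(above, ell, 2.0 + np.sqrt(gamma))
    c2 = (1.0 - gamma / (safe - 1.0) ** 2) / (1.0 + gamma / (safe - 1.0))
    return np.where(above, c2, 0.0)


def shrink_frobenius(lam, gamma):
    """Assumed Frobenius-loss shrinker: eta = ell * c^2 + s^2, with s^2 = 1 - c^2."""
    ell = debias_eigenvalue(lam, gamma)
    c2 = cosine_squared(ell, gamma)
    return ell * c2 + (1.0 - c2)


def shrink_covariance(X, shrinker=shrink_frobenius):
    """Orthogonally invariant estimator: keep sample eigenvectors, shrink eigenvalues.

    X is an (n, p) array of zero-mean observations with unit noise variance.
    """
    n, p = X.shape
    gamma = p / n
    S = X.T @ X / n                   # sample covariance
    lam, V = np.linalg.eigh(S)        # sample eigenvalues and eigenvectors
    eta = shrinker(lam, gamma)        # shrinker acts elementwise
    return (V * eta) @ V.T
```

As a sanity check under these assumptions, with γ = 0.5 a population spike of 3 produces a sample eigenvalue of 3.75. `debias_eigenvalue(3.75, 0.5)` recovers 3, and `shrink_frobenius(3.75, 0.5)` returns approximately 2.4. That is below the debiased value because the sample eigenvector is itself inconsistent, which is the effect the abstract highlights.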

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c357/6152949/01d661fbc71a/nihms-985287-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c357/6152949/f811ee673e87/nihms-985287-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c357/6152949/4a0505b205d1/nihms-985287-f0002.jpg

Similar articles

1
Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model.
Ann Stat. 2018 Aug;46(4):1742-1778. doi: 10.1214/17-AOS1601. Epub 2018 Jun 27.
2
Asymptotics of empirical eigenstructure for high dimensional spiked covariance.
Ann Stat. 2017 Jun;45(3):1342-1374. doi: 10.1214/16-AOS1487. Epub 2017 Jun 13.
3
Estimation of Large-Dimensional Covariance Matrices via Second-Order Stein-Type Regularization.
Entropy (Basel). 2022 Dec 27;25(1):53. doi: 10.3390/e25010053.
4
Inference on the Eigenvalues of the Normalized Precision Matrix.
Linear Algebra Appl. 2024 Dec 15;703:78-108. doi: 10.1016/j.laa.2024.09.002. Epub 2024 Sep 10.
5
PCA in High Dimensions: An orientation.
Proc IEEE Inst Electr Electron Eng. 2018 Aug;106(8):1277-1292. doi: 10.1109/JPROC.2018.2846730. Epub 2018 Jul 18.
6
Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models.
Stat Sin. 2021 Apr;31(2):571-601. doi: 10.5705/ss.202019.0052.
7
An Orthogonally Equivariant Estimator of the Covariance Matrix in High Dimensions and for Small Sample Sizes.
J Stat Plan Inference. 2021 Jul;213:16-32. doi: 10.1016/j.jspi.2020.10.006. Epub 2020 Nov 16.
8
Optimal Estimation and Rank Detection for Sparse Spiked Covariance Matrices.
Probab Theory Relat Fields. 2015 Apr 1;161(3-4):781-815. doi: 10.1007/s00440-014-0562-z.
9
Asymptotic properties of principal component analysis and shrinkage-bias adjustment under the generalized spiked population model.
J Multivar Anal. 2019 Sep;173:145-164. doi: 10.1016/j.jmva.2019.02.007. Epub 2019 Feb 19.
10
Optimal covariance cleaning for heavy-tailed distributions: Insights from information theory.
Phys Rev E. 2023 Nov;108(5-1):054133. doi: 10.1103/PhysRevE.108.054133.

Cited by

1
Method of moments for 3D single particle modeling with non-uniform distribution of viewing angles.
Inverse Probl. 2020 Apr;36(4). doi: 10.1088/1361-6420/ab6139. Epub 2020 Feb 26.
2
Estimation of the number of spiked eigenvalues in a covariance matrix by bulk eigenvalue matching analysis.
J Am Stat Assoc. 2023;118(541):374-392. doi: 10.1080/01621459.2021.1933497. Epub 2021 Jul 23.
3
Mode-wise principal subspace pursuit and matrix spiked covariance model.
J R Stat Soc Series B Stat Methodol. 2024 Sep 2;87(1):232-255. doi: 10.1093/jrsssb/qkae088. eCollection 2025 Feb.
4
An Empirical Bayes Approach to Shrinkage Estimation on the Manifold of Symmetric Positive-Definite Matrices.
J Am Stat Assoc. 2024;119(545):259-272. doi: 10.1080/01621459.2022.2110877. Epub 2022 Sep 27.
5
Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix.
Nat Commun. 2023 Dec 2;14(1):7954. doi: 10.1038/s41467-023-43565-9.
6
Fast principal component analysis for cryo-electron microscopy images.
Biol Imaging. 2023;3. doi: 10.1017/s2633903x23000028. Epub 2023 Feb 3.
7
Linear Hypothesis Testing in Linear Models With High-Dimensional Responses.
J Am Stat Assoc. 2022;117(540):1738-1750. doi: 10.1080/01621459.2021.1884561. Epub 2021 Apr 27.
8
Estimating linkage disequilibrium and selection from allele frequency trajectories.
Genetics. 2023 Mar 2;223(3). doi: 10.1093/genetics/iyac189.
9
James-Stein for the leading eigenvector.
Proc Natl Acad Sci U S A. 2023 Jan 10;120(2):e2207046120. doi: 10.1073/pnas.2207046120. Epub 2023 Jan 5.
10
How to reduce dimension with PCA and random projections?
IEEE Trans Inf Theory. 2021 Dec;67(12):8154-8189. doi: 10.1109/tit.2021.3112821. Epub 2021 Sep 14.

References

1
Tail sums of Wishart and Gaussian eigenvalues beyond the bulk edge.
Aust N Z J Stat. 2018 Mar;60(1):65-74. doi: 10.1111/anzs.12201. Epub 2018 Mar 14.
2
Condition Number Regularized Covariance Estimation.
J R Stat Soc Series B Stat Methodol. 2013 Jun 1;75(3):427-450. doi: 10.1111/j.1467-9868.2012.01049.x.
3
Shrinkage estimators for covariance matrices.
Biometrics. 2001 Dec;57(4):1173-84. doi: 10.1111/j.0006-341x.2001.01173.x.