Large Covariance Estimation by Thresholding Principal Orthogonal Complements.

Author information

Fan Jianqing, Liao Yuan, Mincheva Martina

Affiliations

Department of Operations Research and Financial Engineering, Princeton University; Bendheim Center for Finance, Princeton University.

Department of Mathematics, University of Maryland.

Publication information

J R Stat Soc Series B Stat Methodol. 2013 Sep 1;75(4). doi: 10.1111/rssb.12016.

Abstract

This paper deals with the estimation of a high-dimensional covariance with a conditional sparsity structure and fast-diverging eigenvalues. By assuming a sparse error covariance matrix in an approximate factor model, we allow for the presence of some cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure with sparsity. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan, and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights into when factor analysis is approximately the same as principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies. Finally, a real data application on portfolio allocation is presented.
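
The idea behind the POET estimator described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation. It assumes the number of factors `K` is known and applies one constant soft threshold `tau` to the off-diagonal entries of the residual covariance; the abstract specifies neither choice. The function name `poet_sketch` and its parameters are hypothetical.

```python
import numpy as np

def poet_sketch(X, K, tau):
    """Illustrative POET-style estimate (hypothetical helper, not the authors' code).

    X   : (n, p) array of n observations on p variables
    K   : assumed number of common factors (taken as known here)
    tau : constant soft-threshold level (an assumption of this sketch)
    """
    S = np.cov(X, rowvar=False)                  # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Low-rank part built from the first K principal components.
    V = eigvecs[:, :K]
    low_rank = (V * eigvals[:K]) @ V.T

    # Principal orthogonal complement: what remains after removing the K components.
    residual = S - low_rank

    # Soft-threshold the off-diagonal entries only; keep the diagonal untouched.
    diag_part = np.diag(np.diag(residual))
    off_diag = residual - diag_part
    off_thresholded = np.sign(off_diag) * np.maximum(np.abs(off_diag) - tau, 0.0)

    return low_rank + off_thresholded + diag_part
```

In this sketch, `tau = 0` returns the sample covariance matrix, a very large `tau` leaves a diagonal residual (as in a strict factor model), and `K = 0` reduces to thresholding the sample covariance directly. These limits appear to correspond to the special cases listed in the abstract.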


Similar articles

1
Large Covariance Estimation by Thresholding Principal Orthogonal Complements.
J R Stat Soc Series B Stat Methodol. 2013 Sep 1;75(4). doi: 10.1111/rssb.12016.
2
Asymptotics of empirical eigenstructure for high dimensional spiked covariance.
Ann Stat. 2017 Jun;45(3):1342-1374. doi: 10.1214/16-AOS1487. Epub 2017 Jun 13.
3
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.
Ann Stat. 2011 Jan 1;39(6):3320-3356. doi: 10.1214/11-AOS944.
4
LARGE COVARIANCE ESTIMATION THROUGH ELLIPTICAL FACTOR MODELS.
Ann Stat. 2018 Aug;46(4):1383-1414. doi: 10.1214/17-AOS1588. Epub 2018 Jun 27.
5
Threshold selection for covariance estimation.
Biometrics. 2019 Sep;75(3):895-905. doi: 10.1111/biom.13048. Epub 2019 Apr 3.
6
Robust High-dimensional Volatility Matrix Estimation for High-Frequency Factor Model.
J Am Stat Assoc. 2018;113(523):1268-1283. doi: 10.1080/01621459.2017.1340888. Epub 2018 Oct 8.
7
Optimal Estimation and Rank Detection for Sparse Spiked Covariance Matrices.
Probab Theory Relat Fields. 2015 Apr 1;161(3-4):781-815. doi: 10.1007/s00440-014-0562-z.
8
Robust Covariance Matrix Estimation for High-Dimensional Compositional Data with Application to Sales Data Analysis.
J Bus Econ Stat. 2023;41(4):1090-1100. doi: 10.1080/07350015.2022.2106990. Epub 2022 Sep 21.
9
A Comparison of Methods for Estimating the Determinant of High-Dimensional Covariance Matrix.
Int J Biostat. 2017 Sep 21;13(2):/j/ijb.2017.13.issue-2/ijb-2017-0013/ijb-2017-0013.xml. doi: 10.1515/ijb-2017-0013.
10
Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation.
Ann Stat. 2009;37(6B):4254-4278. doi: 10.1214/09-AOS720.

Cited by

1
Large Precision Matrix Estimation with Unknown Group Structure.
J Am Stat Assoc. 2025 Feb 10. doi: 10.1080/01621459.2024.2442092.
2
Fast Variational Inference for Bayesian Factor Analysis in Single and Multi-Study Settings.
J Comput Graph Stat. 2025;34(1):96-108. doi: 10.1080/10618600.2024.2356173. Epub 2024 Jul 17.
3
Estimation of the number of spiked eigenvalues in a covariance matrix by bulk eigenvalue matching analysis.
J Am Stat Assoc. 2023;118(541):374-392. doi: 10.1080/01621459.2021.1933497. Epub 2021 Jul 23.
5
Are Latent Factor Regression and Sparse Regression Adequate?
J Am Stat Assoc. 2024;119(546):1076-1088. doi: 10.1080/01621459.2023.2169700. Epub 2023 Feb 14.
7
Selective Inference for Hierarchical Clustering.
J Am Stat Assoc. 2024;119(545):332-342. doi: 10.1080/01621459.2022.2116331. Epub 2022 Oct 11.
8
Scattering spectra models for physics.
PNAS Nexus. 2024 Mar 7;3(4):pgae103. doi: 10.1093/pnasnexus/pgae103. eCollection 2024 Apr.
10
HIGH-DIMENSIONAL FACTOR REGRESSION FOR HETEROGENEOUS SUBPOPULATIONS.
Stat Sin. 2023 Jan;33(1):27-53. doi: 10.5705/ss.202020.0145.

References

1
MINIMAX BOUNDS FOR SPARSE PCA WITH NOISY HIGH-DIMENSIONAL DATA.
Ann Stat. 2013 Jun;41(3):1055-1084. doi: 10.1214/12-AOS1014.
2
Estimating False Discovery Proportion Under Arbitrary Covariance Dependence.
J Am Stat Assoc. 2012;107(499):1019-1035. doi: 10.1080/01621459.2012.720478.
3
Vast Portfolio Selection with Gross-exposure Constraints.
J Am Stat Assoc. 2012;107(498):592-606. doi: 10.1080/01621459.2012.682825. Epub 2012 May 14.
4
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.
Ann Stat. 2011 Jan 1;39(6):3320-3356. doi: 10.1214/11-AOS944.
5
A flexible estimating equations approach for mapping function-valued traits.
Genetics. 2011 Sep;189(1):305-16. doi: 10.1534/genetics.111.129221. Epub 2011 Jul 29.
6
High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics.
J Am Stat Assoc. 2008 Dec 1;103(484):1438-1456. doi: 10.1198/016214508000000869.
7
Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation.
Ann Stat. 2009;37(6B):4254-4278. doi: 10.1214/09-AOS720.
8
Correlated z-values and the accuracy of large-scale statistical estimates.
J Am Stat Assoc. 2010 Sep 1;105(491):1042-1055. doi: 10.1198/jasa.2010.tm09129.
9
On Consistency and Sparsity for Principal Components Analysis in High Dimensions.
J Am Stat Assoc. 2009 Jun 1;104(486):682-693. doi: 10.1198/jasa.2009.0121.
10
A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis.
Biostatistics. 2009 Jul;10(3):515-34. doi: 10.1093/biostatistics/kxp008. Epub 2009 Apr 17.
