Fan Jianqing, Liu Han, Wang Weichen
Dept of Operations Research & Financial Engineering, Sherrerd Hall, Princeton University, Princeton, NJ 08544, USA.
Ann Stat. 2018 Aug;46(4):1383-1414. doi: 10.1214/17-AOS1588. Epub 2018 Jun 27.
We propose a general Principal Orthogonal complEment Thresholding (POET) framework for large-scale covariance matrix estimation based on the approximate factor model. To better understand how POET works, we establish a set of high-level sufficient conditions under which the procedure achieves optimal rates of convergence under different matrix norms. The framework lets us recover existing results for sub-Gaussian data in a more transparent way that depends only on the concentration properties of the sample covariance matrix. As a new theoretical contribution, the framework allows us, for the first time, to exploit the conditional sparsity structure of the covariance for heavy-tailed data. In particular, for elliptical distributions, we propose a robust estimator based on the marginal and spatial Kendall's tau that satisfies these conditions. In addition, we study the conditional graphical model under the same framework. The technical tools developed in this paper are of general interest for high-dimensional principal component analysis. Thorough numerical results are provided to back up the developed theory.
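The two-step idea named in the abstract (keep the leading principal components, then threshold their orthogonal complement) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `poet`, its parameters, the constant soft threshold, and the simulated data are all assumptions made for illustration. The pilot matrix here is the plain sample covariance; for heavy-tailed data a robust estimate would presumably take its place.

```python
import numpy as np


def poet(pilot_cov, n_factors, threshold):
    """Illustrative POET-style estimate (hypothetical helper, not the paper's code).

    pilot_cov : (p, p) symmetric pilot covariance estimate
    n_factors : number of leading eigenpairs kept as the low-rank part
    threshold : constant soft-threshold level for off-diagonal residual
                entries (a simplification assumed for this sketch)
    """
    # eigh returns eigenvalues in ascending order, so pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(pilot_cov)
    top = np.argsort(eigvals)[::-1][:n_factors]
    lam, vec = eigvals[top], eigvecs[:, top]

    # Low-rank part spanned by the leading principal components.
    low_rank = (vec * lam) @ vec.T

    # Principal orthogonal complement: what the leading components leave over.
    residual = pilot_cov - low_rank

    # Soft-threshold the off-diagonal entries; keep the diagonal untouched.
    sparse = np.sign(residual) * np.maximum(np.abs(residual) - threshold, 0.0)
    np.fill_diagonal(sparse, np.diag(residual))

    return low_rank + sparse


# Simulated approximate factor model: 2 factors, 30 variables, 200 samples.
rng = np.random.default_rng(0)
n, p, k = 200, 30, 2
loadings = rng.normal(size=(p, k))
factors = rng.normal(size=(n, k))
noise = rng.normal(size=(n, p))
data = factors @ loadings.T + noise

sample_cov = np.cov(data, rowvar=False)
estimate = poet(sample_cov, n_factors=k, threshold=0.1)
```

With `threshold=0` the sketch returns the pilot matrix unchanged, which gives a quick sanity check that the low-rank and residual parts add back up correctly.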