Sparsistency and Rates of Convergence in Large Covariance Matrix Estimation.

Authors

Lam Clifford, Fan Jianqing

Affiliation

Department of Statistics, London School of Economics and Political Science, London, WC2A 2AE (

Publication information

Ann Stat. 2009;37(6B):4254-4278. doi: 10.1214/09-AOS720.

Abstract

This paper studies the sparsistency and rates of convergence for estimating sparse covariance and precision matrices based on penalized likelihood with nonconvex penalty functions. Here, sparsistency refers to the property that all parameters that are zero are actually estimated as zero with probability tending to one. Depending on the application, sparsity may occur a priori on the covariance matrix, its inverse, or its Cholesky decomposition. We study these three sparsity exploration problems under a unified framework with a general penalty function. We show that the rates of convergence for these problems under the Frobenius norm are of order $(s_n \log p_n / n)^{1/2}$, where $s_n$ is the number of nonzero elements, $p_n$ is the size of the covariance matrix, and $n$ is the sample size. This makes explicit that the contribution of high dimensionality is merely a logarithmic factor. The conditions on the rate at which the tuning parameter $\lambda_n$ goes to 0 are made explicit and compared across penalties. As a result, for the $L_1$ penalty, to guarantee sparsistency and the optimal rate of convergence, the number of nonzero elements should be small: $s_n' = O(p_n)$ at most, among $O(p_n^2)$ parameters, for estimating a sparse covariance or correlation matrix, a sparse precision or inverse correlation matrix, or a sparse Cholesky factor, where $s_n'$ is the number of nonzero off-diagonal entries. On the other hand, with the SCAD or hard-thresholding penalty functions, there is no such restriction.
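
The penalties and the rate named in the abstract can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the paper. The SCAD and hard-thresholding formulas follow their standard textbook definitions, and the default `a = 3.7` is the conventional SCAD choice, which the abstract itself does not state. The sketch only evaluates the penalty functions and the order of the Frobenius-norm rate; it does not fit a penalized likelihood.

```python
import math


def l1_penalty(theta, lam):
    """L1 penalty: lam * |theta|, which grows without bound in |theta|."""
    return lam * abs(theta)


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty in its standard piecewise form (a > 2 assumed).

    Linear near zero, quadratic in the middle, constant for |theta| > a * lam.
    """
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def hard_threshold_penalty(theta, lam):
    """Hard-thresholding penalty: lam^2 - (|theta| - lam)^2 for |theta| < lam, else lam^2."""
    t = abs(theta)
    if t < lam:
        return lam * lam - (t - lam) ** 2
    return lam * lam


def frobenius_rate(s_n, p_n, n):
    """Order of the Frobenius-norm rate from the abstract: sqrt(s_n * log(p_n) / n)."""
    return math.sqrt(s_n * math.log(p_n) / n)


if __name__ == "__main__":
    lam = 1.0
    for theta in (0.5, 2.0, 10.0):
        print(theta, l1_penalty(theta, lam), scad_penalty(theta, lam),
              hard_threshold_penalty(theta, lam))
    # Illustrative sizes only; these numbers are not from the paper.
    print(frobenius_rate(s_n=100, p_n=1000, n=10000))
```

For large `|theta|` the L1 penalty keeps increasing, while the SCAD and hard-thresholding penalties level off at a constant. This difference in shape is one commonly cited reason the two families behave differently, though the precise conditions are those established in the paper, not in this sketch.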

Similar articles

2
Sparse estimation of a covariance matrix.
Biometrika. 2011 Dec;98(4):807-820. doi: 10.1093/biomet/asr054.
5
Optimal Estimation and Rank Detection for Sparse Spiked Covariance Matrices.
Probab Theory Relat Fields. 2015 Apr 1;161(3-4):781-815. doi: 10.1007/s00440-014-0562-z.
10
Joint Estimation of Precision Matrices in Heterogeneous Populations.
Electron J Stat. 2016;10(1):1341-1392. doi: 10.1214/16-EJS1137. Epub 2016 May 31.

Cited by

8
Covariance estimation via fiducial inference.
Stat Theory Relat Fields. 2021;5(4):316-331. doi: 10.1080/24754269.2021.1877950. Epub 2021 Feb 15.
