Bien Jacob, Bunea Florentina, Xiao Luo
Cornell University, Department of Biological Statistics and Computational Biology, 1178 Comstock Hall, Ithaca, 14853, United States.
Cornell University, Statistical Science, Ithaca, 14850, United States.
J Am Stat Assoc. 2016;111(514):834-845. doi: 10.1080/01621459.2015.1058265. Epub 2016 Aug 18.
We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, can equivalently be expressed as an estimator that tapers the sample covariance matrix by a Toeplitz, sparsely banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapering estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly studied classes of covariance matrices and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings.
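To make the tapering view concrete, the following is a minimal sketch of what it means to taper a sample covariance matrix by a Toeplitz, banded matrix: the taper's entries depend only on the lag |j - k|, they vanish beyond a bandwidth, and the estimate is the elementwise product of the two matrices. It is not the convex banding estimator itself. In the paper the taper weights are data-adaptive and come from solving a convex problem, whereas here the weights, the toy data, and the function names (`sample_covariance`, `toeplitz_taper`, `taper_covariance`) are all hypothetical choices made for illustration.

```python
def sample_covariance(rows):
    """Sample covariance (divisor n) of a list of equal-length observations."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / n
         for k in range(p)]
        for j in range(p)
    ]


def toeplitz_taper(p, weights):
    """Toeplitz taper matrix: entry (j, k) depends only on the lag |j - k|.

    weights[d] is the weight for lag d. Lags beyond len(weights) - 1 get 0,
    which is what makes the taper banded.
    """
    return [
        [weights[abs(j - k)] if abs(j - k) < len(weights) else 0.0
         for k in range(p)]
        for j in range(p)
    ]


def taper_covariance(S, W):
    """Elementwise (Hadamard) product of the covariance S and the taper W."""
    p = len(S)
    return [[S[j][k] * W[j][k] for k in range(p)] for j in range(p)]


# Hypothetical toy data: 4 observations of 4 ordered variables.
X = [
    [1.0, 2.0, 0.5, 3.0],
    [2.0, 1.5, 1.0, 2.0],
    [0.0, 2.5, 1.5, 1.0],
    [3.0, 1.0, 0.0, 2.0],
]
S = sample_covariance(X)

# Hand-picked weights (an assumption, not data-adaptive): keep the diagonal,
# halve the first off-diagonal, zero out everything beyond bandwidth 1.
W = toeplitz_taper(4, [1.0, 0.5])
Sigma_hat = taper_covariance(S, W)
```

With these weights, `Sigma_hat` keeps the sample variances on the diagonal, shrinks the lag-1 covariances by half, and is exactly zero at lags 2 and 3, so it is banded with bandwidth 1. A data-adaptive method would choose the weights, and hence the bandwidth, from the data rather than fixing them in advance.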