Won Joong-Ho, Lim Johan, Kim Seung-Jean, Rajaratnam Bala
School of Industrial Management Engineering, Korea University, Seoul, Korea.
J R Stat Soc Series B Stat Methodol. 2013 Jun 1;75(3):427-450. doi: 10.1111/j.1467-9868.2012.01049.x.
Estimation of high-dimensional covariance matrices is known to be a difficult problem, has many applications, and is of current interest to the larger statistics community. In many applications, including the so-called "large p, small n" setting, the estimate of the covariance matrix is required to be not only invertible, but also well-conditioned. Although many regularization schemes attempt to do this, none of them addresses the ill-conditioning problem directly. In this paper, we propose a maximum likelihood approach, with the direct goal of obtaining a well-conditioned estimator. No sparsity assumption on either the covariance matrix or its inverse is imposed, thus making our procedure more widely applicable. We demonstrate that the proposed regularization scheme is computationally efficient, yields a type of Steinian shrinkage estimator, and has a natural Bayesian interpretation. We investigate the theoretical properties of the regularized covariance estimator comprehensively, including its regularization path, and proceed to develop an approach that adaptively determines the level of regularization that is required. Finally, we demonstrate the performance of the regularized estimator in decision-theoretic comparisons and in the financial portfolio optimization setting. The proposed approach has desirable properties, and can serve as a competitive procedure, especially when the sample size is small and when a well-conditioned estimator is required.
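To make concrete what a well-conditioned estimate means when the sample size is smaller than the dimension, the sketch below bounds the condition number of a sample covariance matrix by clipping its eigenvalues into an interval [tau, kappa_max * tau]. This is an illustration only: the abstract does not state the estimator's formula, so the function name `clip_condition_number`, the parameters `kappa_max` and `tau`, and the default choice of `tau` are all assumptions made here, and the default is not the likelihood-based choice the paper develops.

```python
import numpy as np

def clip_condition_number(sample_cov, kappa_max, tau=None):
    """Clip the eigenvalues of a symmetric matrix into [tau, kappa_max * tau].

    Hypothetical helper for illustration; not the paper's exact estimator.
    """
    eigvals, eigvecs = np.linalg.eigh(sample_cov)
    if tau is None:
        # Assumed default: keep the largest eigenvalue fixed and raise the
        # small ones. A likelihood-based choice of tau would differ.
        tau = eigvals.max() / kappa_max
    clipped = np.clip(eigvals, tau, kappa_max * tau)
    # Rebuild the matrix with the original eigenvectors and clipped eigenvalues.
    return (eigvecs * clipped) @ eigvecs.T

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 30))   # n = 10 observations, p = 30 variables
S = np.cov(X, rowvar=False)         # 30 x 30 sample covariance, rank deficient
Sigma = clip_condition_number(S, kappa_max=50.0)
```

Because the eigenvectors of the sample covariance are kept and only the eigenvalues move, the result is a shrinkage-type estimate that is invertible by construction, with condition number at most `kappa_max`.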