Department of Economics and Consortium for Data Analytics in Risk, University of California, Berkeley, CA 94720.
Aperio by BlackRock, Sausalito, CA 94965.
Proc Natl Acad Sci U S A. 2023 Jan 10;120(2):e2207046120. doi: 10.1073/pnas.2207046120. Epub 2023 Jan 5.
Recent research identifies and corrects bias, such as excess dispersion, in the leading sample eigenvector of a factor-based covariance matrix estimated from a high-dimension, low sample size (HL) data set. We show that eigenvector bias can have a substantial impact on variance-minimizing optimization in the HL regime, while bias in estimated eigenvalues may have little effect. We describe a data-driven eigenvector shrinkage estimator in the HL regime, called "James-Stein for eigenvectors" (JSE), and its close relationship with the James-Stein (JS) estimator for a collection of averages. We show, both theoretically and with numerical experiments, that for certain variance-minimizing problems of practical importance, efforts to correct eigenvalues have little value in comparison to the JSE correction of the leading eigenvector. When certain extra information is present, JSE is a consistent estimator of the leading eigenvector.
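To make the JS analogy concrete, the sketch below shrinks the entries of the leading sample eigenvector toward their mean with a data-driven constant, structurally the same move the James-Stein estimator makes when shrinking a collection of averages toward their grand mean. This is a minimal illustration, not the paper's exact estimator: the function names are ours, and the noise plug-in (the ratio of the average trailing sample eigenvalue to the leading one) and the positive-part clipping are assumptions made for the sketch.

```python
import numpy as np

def js_shrink_means(z, sigma2):
    """Classic positive-part James-Stein: shrink p noisy averages z, each with
    known noise variance sigma2, toward their grand mean."""
    p = z.size
    zbar = z.mean()
    c = max(0.0, 1.0 - (p - 3) * sigma2 / np.sum((z - zbar) ** 2))
    return zbar + c * (z - zbar)

def jse_leading_eigenvector(X):
    """JS-type shrinkage of the leading sample eigenvector (a sketch of JSE).

    X is an n x p data matrix with p >> n (the HL regime). The entries of the
    leading unit eigenvector are shrunk toward their mean, mirroring
    js_shrink_means; the noise estimate below is an illustrative plug-in,
    not the paper's exact data-driven constant."""
    n, p = X.shape
    S = X.T @ X / n                                # sample covariance (data assumed centered)
    evals, evecs = np.linalg.eigh(S)               # eigenvalues in ascending order
    lam2, h = evals[-1], evecs[:, -1]              # leading eigenpair
    if h.mean() < 0:
        h = -h                                     # sign convention: mostly positive entries
    ell2 = (np.trace(S) - lam2) / (min(n, p) - 1)  # average trailing eigenvalue
    m = h.mean()                                   # shrinkage target: the constant vector m*1
    disp = np.sum((h - m) ** 2)                    # dispersion of eigenvector entries
    noise = ell2 / lam2                            # assumed noise share of that dispersion
    c = max(0.0, 1.0 - noise / disp)               # positive-part JS-style constant
    h_js = m + c * (h - m)                         # shrink entries toward their mean
    return h_js / np.linalg.norm(h_js)             # re-normalize to unit length

# Toy HL experiment: one factor, p >> n; compare alignment with the truth.
rng = np.random.default_rng(0)
n, p = 24, 500
beta = rng.uniform(0.5, 1.5, size=p)               # true factor exposures
b = beta / np.linalg.norm(beta)                    # true leading eigenvector
X = rng.standard_normal((n, 1)) * beta + rng.standard_normal((n, p))
h_raw = np.linalg.eigh(X.T @ X / n)[1][:, -1]
print("cos(angle), raw sample eigenvector:", abs(h_raw @ b))
print("cos(angle), JS-shrunk eigenvector :", abs(jse_leading_eigenvector(X) @ b))
```

In simulations of this kind, the raw eigenvector exhibits the excess dispersion described above, and a JS-style correction typically moves it closer to the truth, consistent with the abstract's claim that correcting the leading eigenvector matters more than correcting eigenvalues for certain variance-minimizing problems.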