Abbe Emmanuel, Fan Jianqing, Wang Kaizheng, Zhong Yiqiao
PACM and Department of EE, Princeton University, Princeton, NJ 08544, USA.
Department of ORFE, Princeton University, Princeton, NJ 08544, USA.
Ann Stat. 2020 Jun;48(3):1452-1474. doi: 10.1214/19-aos1854. Epub 2020 Jul 17.
Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose expectations are low-rank, which helps settle the conjecture in Abbe et al. (2014b) that the spectral algorithm achieves exact recovery in the stochastic block model without any trimming or cleaning steps. The key is a first-order approximation of eigenvectors under the norm: where { } and are eigenvectors of a random matrix and its expectation , respectively. The fact that the approximation is both tight and linear in facilitates sharp comparisons between and . In particular, it allows for comparing the signs of and even if is large. The results are further extended to perturbations of eigenspaces, yielding new -type bounds for synchronization ( -spiked Wigner model) and noisy matrix completion.
通过特征向量扰动分析恢复低秩结构是统计机器学习中的一个常见问题,例如在因子分析、社区检测、排序、矩阵补全等等。虽然对于特征向量的经验统计量和总体统计量之间的平均误差有各种各样的界,但对于逐元素分析来说,很少有结果是紧密的,而逐元素分析对于诸如社区检测等许多问题至关重要。本文研究了一类期望为低秩的随机矩阵特征向量的逐元素行为,这有助于解决阿贝等人(2014b)的猜想,即在随机块模型中,谱算法无需任何修剪或清理步骤就能实现精确恢复。关键在于在 范数下特征向量的一阶近似:其中{ }和 分别是随机矩阵 及其期望 的特征向量。该近似在 中既是紧密的又是线性的这一事实便于对 和 进行精确比较。特别地,即使 很大,它也允许比较 和 的符号。结果进一步扩展到特征子空间的扰动,为同步( -尖峰维格纳模型)和噪声矩阵补全产生了新的 -型界。