Fan Jianqing, Fan Yingying, Han Xiao, Lv Jinchi
Princeton University.
University of Southern California.
J Am Stat Assoc. 2022;117(538):996-1009. doi: 10.1080/01621459.2020.1840990. Epub 2020 Dec 8.
Characterizing the asymptotic distributions of eigenvectors for large random matrices poses important challenges yet can provide useful insights into a range of statistical applications. To this end, in this paper we introduce a general framework of asymptotic theory of eigenvectors (ATE) for large spiked random matrices with diverging spikes and heterogeneous variances, and establish the asymptotic properties of the spiked eigenvectors and eigenvalues for the scenario of the generalized Wigner matrix noise. Under some mild regularity conditions, we provide the asymptotic expansions for the spiked eigenvalues and show that they are asymptotically normal after some normalization. For the spiked eigenvectors, we establish asymptotic expansions for the general linear combination and further show that it is asymptotically normal after some normalization, where the weight vector can be arbitrary. We also provide a more general asymptotic theory for the spiked eigenvectors using the bilinear form. Simulation studies verify the validity of our new theoretical results. Our family of models encompasses many popularly used ones such as the stochastic block models with or without overlapping communities for network analysis and the topic models for text analysis, and our general theory can be exploited for statistical inference in these large-scale applications.
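As a rough illustration of the setting described in the abstract, the following Python sketch simulates a rank-one spiked matrix with symmetric, mean-zero noise of heterogeneous variances, and records the top eigenvalue together with a linear combination of the top eigenvector across replications. It is not the authors' simulation design. The function names, the Gaussian noise, the uniform range of standard deviations, the flat population eigenvector, and the choice of spike strength equal to n are all assumptions made for illustration only.

```python
import numpy as np


def simulate_spiked_matrix(n, spike, rng):
    """Return (X, v) with X = spike * v v^T + W (hypothetical helper).

    W is symmetric with independent mean-zero entries on and above the
    diagonal whose standard deviations differ across entries, a simple
    stand-in for generalized Wigner noise (assumption: Gaussian entries).
    """
    v = np.ones(n) / np.sqrt(n)  # assumed population eigenvector (unit norm)
    sd = rng.uniform(0.5, 1.5, size=(n, n))  # assumed heterogeneous std devs
    sd = np.triu(sd) + np.triu(sd, 1).T  # symmetrize the std-dev profile
    g = rng.standard_normal((n, n))
    g = np.triu(g) + np.triu(g, 1).T  # symmetrize the Gaussian draws
    W = sd * g
    X = spike * np.outer(v, v) + W
    return X, v


def top_eigenpair(X, v_ref):
    """Largest eigenvalue of X and its eigenvector, sign-aligned with v_ref."""
    vals, vecs = np.linalg.eigh(X)  # eigenvalues returned in ascending order
    lam, vec = vals[-1], vecs[:, -1]
    if vec @ v_ref < 0:  # eigenvectors are only defined up to sign
        vec = -vec
    return lam, vec


def run_replications(n=100, reps=20, seed=0):
    """Collect the top eigenvalue and a linear form u^T v_hat over replications."""
    rng = np.random.default_rng(seed)
    spike = float(n)  # assumed spike strength, well above the noise level
    u = np.zeros(n)
    u[0] = 1.0  # an arbitrary weight vector; here the first coordinate
    eigvals = np.empty(reps)
    lin_forms = np.empty(reps)
    for r in range(reps):
        X, v = simulate_spiked_matrix(n, spike, rng)
        lam, vec = top_eigenpair(X, v)
        eigvals[r] = lam
        lin_forms[r] = u @ vec
    return spike, eigvals, lin_forms


if __name__ == "__main__":
    spike, eigvals, lin_forms = run_replications()
    # Standardize empirically; a histogram or Q-Q plot of z against N(0, 1)
    # is one informal way to look at approximate normality.
    z = (eigvals - eigvals.mean()) / eigvals.std(ddof=1)
    print("spike:", spike)
    print("mean top eigenvalue:", eigvals.mean())
    print("mean of u^T v_hat:", lin_forms.mean())
```

The standardization here uses the empirical mean and standard deviation across replications rather than the centering and scaling derived in the paper, so it only gives an informal look at the shape of the distribution.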