Choi D S, Wolfe P J, Airoldi E M
School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, U.S.A.
Biometrika. 2012 Jun;99(2):273-284. doi: 10.1093/biomet/asr053. Epub 2012 Apr 17.
We present asymptotic and finite-sample results on the use of stochastic blockmodels for the analysis of network data. We show that the fraction of misclassified network nodes converges in probability to zero under maximum likelihood fitting when the number of classes is allowed to grow as the square root of the network size and the average network degree grows at least poly-logarithmically in this size. We also establish finite-sample confidence bounds on maximum-likelihood blockmodel parameter estimates from data comprising independent Bernoulli random variates; these results hold uniformly over class assignments. We provide simulations verifying the conditions sufficient for our results, and conclude by fitting a logit parameterization of a stochastic blockmodel with covariates to a network data example comprising self-reported school friendships, resulting in block estimates that reveal residual structure.
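As a rough illustration of the quantities named in the abstract, the following Python sketch simulates a stochastic blockmodel with independent Bernoulli edges, computes the per-block edge densities (the Bernoulli maximum-likelihood estimates for a fixed class assignment), and measures a misclassification fraction. It is not the authors' code. The function names, the parameter values, and the convention of minimizing the error over label permutations are all assumptions made for illustration, and the paper's own definition of a misclassified node may differ. The sketch does not search over class assignments, which full maximum likelihood fitting would require.

```python
import itertools

import numpy as np


def simulate_sbm(z, theta, rng):
    """Draw a symmetric 0/1 adjacency matrix with independent Bernoulli edges.

    z     : length-n integer array of class labels (hypothetical input)
    theta : k x k symmetric matrix of block edge probabilities
    """
    n = len(z)
    p = theta[z[:, None], z[None, :]]            # n x n edge probabilities
    upper = np.triu(rng.random((n, n)) < p, k=1)  # one draw per unordered pair
    return (upper | upper.T).astype(int)          # symmetrize, no self-loops


def block_mle(adj, z, k):
    """Edge density of each block pair for a FIXED class assignment z."""
    onehot = np.eye(k, dtype=int)[z]              # n x k membership indicators
    edges = onehot.T @ adj @ onehot               # diagonal counts each edge twice
    sizes = onehot.sum(axis=0)
    # Ordered node pairs per block pair, excluding self-pairs on the diagonal,
    # so the double counting in `edges` cancels for within-block densities.
    pairs = np.outer(sizes, sizes) - np.diag(sizes)
    return edges / np.maximum(pairs, 1)


def misclassified_fraction(z_true, z_hat, k):
    """Fraction of mislabeled nodes, minimized over relabelings of z_hat.

    Assumed convention; brute force over k! permutations, so small k only.
    """
    best = 1.0
    for perm in itertools.permutations(range(k)):
        relabeled = np.array(perm)[z_hat]
        best = min(best, float(np.mean(relabeled != z_true)))
    return best


# Illustrative setup (all values are assumptions, not taken from the paper).
rng = np.random.default_rng(0)
k, n = 3, 300
z = np.repeat(np.arange(k), n // k)                # three equal-sized classes
theta = np.full((k, k), 0.05) + 0.25 * np.eye(k)   # denser within classes
adj = simulate_sbm(z, theta, rng)
theta_hat = block_mle(adj, z, k)
```

Comparing `theta_hat` with `theta` for growing `n` gives a feel for how the block estimates concentrate, and `misclassified_fraction` can be applied to the output of any fitting procedure that returns a class assignment.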