Wang Cheng, Butts Carter T, Hipp John R, Jose Rupa, Lakon Cynthia M
Department of Sociology, University of Notre Dame.
Departments of Sociology and Statistics, University of California, Irvine.
Soc Networks. 2016 Mar 1;45:89-98. doi: 10.1016/j.socnet.2015.12.003.
Recent developments have made model-based imputation of network data feasible in principle, but the extant literature provides few practical examples of its use. In this paper we consider 14 schools from the widely used In-School Survey of Add Health (Harris et al., 2009), applying an ERGM-based estimation and simulation approach to impute the missing network data for each school. Add Health's complex study design leads to multiple types of missingness, and we introduce practical techniques for handling each. We also develop a cross-validation-based method, Held-Out Predictive Evaluation (HOPE), for assessing this approach. Our results suggest that ERGM-based imputation of edge variables is a viable approach to the analysis of complex studies such as Add Health, provided that care is taken to understand and account for the study design.
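The held-out evaluation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' HOPE procedure, whose details the abstract does not give; it only shows the general pattern of masking some observed edge variables, fitting a model to the rest, and scoring predictions on the masked ones. As a stand-in for a full ERGM it assumes a dyad-independent model with an edges term and a single homophily term, for which the fitted tie probabilities are simply the observed tie proportions among same-group and cross-group dyads. The function names, the `group` attribute, the 20% hold-out fraction, and the toy network are all hypothetical.

```python
import random


def fit_tie_probs(adj, group, held_out):
    """Estimate tie probabilities for same-group and cross-group dyads.

    Uses only dyads that are observed (not None) and not held out.
    Stand-in for ERGM estimation: a dyad-independent edges + homophily model.
    """
    counts = {True: [0, 0], False: [0, 0]}  # same_group -> [ties, dyads]
    n = len(adj)
    for i in range(n):
        for j in range(n):
            if i == j or adj[i][j] is None or (i, j) in held_out:
                continue
            same = group[i] == group[j]
            counts[same][0] += adj[i][j]
            counts[same][1] += 1
    # Fall back to 0.5 when a category has no usable dyads.
    return {k: (t / d if d else 0.5) for k, (t, d) in counts.items()}


def held_out_brier(adj, group, frac=0.2, seed=0):
    """Mask a random fraction of observed dyads and score predictions on them.

    Returns the mean squared error (Brier score) of the predicted tie
    probabilities against the held-out edge values.
    """
    rng = random.Random(seed)
    n = len(adj)
    observed = [(i, j) for i in range(n) for j in range(n)
                if i != j and adj[i][j] is not None]
    k = max(1, int(frac * len(observed)))
    held_out = set(rng.sample(observed, k))
    probs = fit_tie_probs(adj, group, held_out)
    errors = [(probs[group[i] == group[j]] - adj[i][j]) ** 2
              for i, j in held_out]
    return sum(errors) / len(errors)


# Hypothetical toy network: six actors in two groups, ties only within groups.
group = [0, 0, 0, 1, 1, 1]
adj = [[1 if (i != j and group[i] == group[j]) else 0 for j in range(6)]
       for i in range(6)]
adj[0][1] = None  # an edge variable that is missing in the "survey"
adj[3][0] = None

score = held_out_brier(adj, group)
```

Lower scores mean the model predicts the held-out edge variables better. A realistic application would replace `fit_tie_probs` with a fitted ERGM and would have to respect the study design when deciding which edge variables count as observed.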