A Framework for the Comparison of Maximum Pseudo Likelihood and Maximum Likelihood Estimation of Exponential Family Random Graph Models.

Author information

van Duijn Marijtje A J, Gile Krista J, Handcock Mark S

Affiliation

Department of Sociology, University of Groningen, Grote Rozenstraat 31, 9712 TG Groningen, The Netherlands.

Publication information

Soc Networks. 2009 Jan;31(1):52-62. doi: 10.1016/j.socnet.2008.10.003.

Abstract

The statistical modeling of social network data is difficult due to the complex dependence structure of the tie variables. Statistical exponential families of distributions provide a flexible way to model such dependence. They enable the statistical characteristics of the network to be encapsulated within an exponential family random graph (ERG) model. For a long time, however, likelihood-based estimation was only feasible for ERG models assuming dyad independence. For more realistic and complex models, inference has been based on the pseudo-likelihood. Recent advances in computational methods have made likelihood-based inference practical, and comparison of the different estimators possible.

In this paper, we present methodology to enable estimators of ERG model parameters to be compared. We use this methodology to compare the bias, standard errors, coverage rates and efficiency of maximum likelihood and maximum pseudo-likelihood estimators. We also propose an improved pseudo-likelihood estimation method aimed at reducing bias. The comparison is performed using simulated social network data based on two versions of an empirically realistic network model, the first representing Lazega's law firm data and the second a modified version with increased transitivity. The framework considers estimation of both the natural and the mean-value parameters.

The results clearly show the superiority of the likelihood-based estimators over those based on pseudo-likelihood, with the bias-reduced pseudo-likelihood outperforming the general pseudo-likelihood. The use of the mean-value parameterization provides insight into the differences between the estimators and when these differences will matter in practice.
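To make the pseudo-likelihood idea concrete, here is a minimal sketch of maximum pseudo-likelihood estimation for a toy undirected ERG model. It relies on the standard fact that the pseudo-likelihood has the form of a logistic regression of the tie indicators on the change statistics. The two-term model (edges and triangles), the function names `change_statistics` and `fit_mple`, and the Newton-Raphson solver are assumptions chosen for illustration; they are not the model or the software used in the paper, and the sketch does not include the bias-reduced variant or the likelihood-based estimator.

```python
import numpy as np


def change_statistics(adj):
    """Per-dyad change statistics for a hypothetical [edges, triangles] model.

    adj is a symmetric 0/1 adjacency matrix with a zero diagonal.
    Toggling tie (i, j) on adds one edge and one triangle per common neighbour.
    """
    n = adj.shape[0]
    rows, ties = [], []
    for i in range(n):
        for j in range(i + 1, n):
            shared = int(np.sum(adj[i] * adj[j]))  # common neighbours of i and j
            rows.append([1.0, float(shared)])
            ties.append(float(adj[i, j]))
    return np.array(rows), np.array(ties)


def fit_mple(delta, y, n_iter=50, ridge=1e-8, tol=1e-10):
    """Maximise the pseudo-likelihood by Newton-Raphson logistic regression.

    The small ridge term only stabilises the linear solve; it is not part of
    the pseudo-likelihood and does not move the stationary point.
    """
    theta = np.zeros(delta.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(delta @ theta)))  # conditional tie probabilities
        grad = delta.T @ (y - p)
        hess = (delta * (p * (1.0 - p))[:, None]).T @ delta
        step = np.linalg.solve(hess + ridge * np.eye(len(theta)), grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta


if __name__ == "__main__":
    # Simulated Bernoulli graph, used only to exercise the code.
    rng = np.random.default_rng(0)
    n = 30
    upper = np.triu(rng.random((n, n)) < 0.15, k=1).astype(int)
    adj = upper + upper.T
    delta, y = change_statistics(adj)
    print("MPLE (edges, triangles):", fit_mple(delta, y))
```

With only the edges term, the dyads are independent and the estimate reduces to the log-odds of the observed density, which is the dyad-independence case the abstract describes as long having been tractable. Adding the triangle term makes the dyads dependent, and the pseudo-likelihood then only approximates the likelihood.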

