Peter Hoff, Bailey Fosdick, Alex Volfovsky, Katherine Stovel
Departments of Statistics and Biostatistics, University of Washington, Seattle, WA 98195, USA.
Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, NC 27709, USA.
Netw Sci (Camb Univ Press). 2013 Dec 1;1(3):253-277. doi: 10.1017/nws.2013.17.
Many studies that gather social network data use survey methods that lead to censored, missing, or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most statistical models are formulated in terms of completely observed binary networks. Statistical analyses of FRN data with such models ignore the censored and ranked nature of the data and could result in misleading statistical inference. To investigate this possibility, we compare Bayesian parameter estimates obtained from a likelihood for complete binary networks with those obtained from likelihoods that are derived from the FRN scheme and therefore accommodate the ranked and censored nature of the data. We show analytically and via simulation that the binary likelihood can provide misleading inference, particularly for certain model parameters that relate network ties to characteristics of individuals and pairs of individuals. We also compare these different likelihoods in a data analysis of several adolescent social networks. For some of these networks, the parameter estimates from the binary and FRN likelihoods lead to different conclusions, indicating the importance of analyzing FRN data with a method that accounts for the FRN survey design.
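The censoring that the FRN scheme introduces can be illustrated with a small Python sketch. This is a toy illustration and not the authors' model or likelihood: the function names `frn_censor` and `classify_unreported`, the latent affinity scores, and the rule that a positive score means a true tie are all assumptions made here for demonstration. The sketch also assumes that a respondent who uses fewer than the maximum number of nominations has no further ties, while a respondent who uses all of them leaves the remaining relations uncertain.

```python
def frn_censor(latent_row, max_nominations):
    """Simulate one respondent's answer under a fixed rank nomination survey.

    latent_row maps each alter to a hypothetical latent affinity score.
    Assumption of this sketch: a score > 0 means a true tie exists.
    Returns a dict alter -> rank (1 = first choice) containing at most
    max_nominations alters.
    """
    true_ties = [(score, alter) for alter, score in latent_row.items() if score > 0]
    true_ties.sort(reverse=True)  # strongest tie first
    nominated = true_ties[:max_nominations]
    return {alter: i + 1 for i, (_, alter) in enumerate(nominated)}


def classify_unreported(alters, ranks, max_nominations):
    """Label every alter that the respondent did not nominate.

    If the respondent used every available nomination, an unreported
    relation may or may not exist ("uncertain"). If nominations were left
    unused, this sketch assumes the unreported relation is truly absent.
    """
    censored = len(ranks) == max_nominations
    label = "uncertain" if censored else "absent"
    return {alter: label for alter in alters if alter not in ranks}


# Hypothetical latent affinities of one respondent toward four alters.
latent = {"B": 2.1, "C": 0.4, "D": 1.3, "E": -0.7}

for cap in (2, 5):
    ranks = frn_censor(latent, cap)
    status = classify_unreported(latent, ranks, cap)
    print(f"cap={cap} ranks={ranks} unreported={status}")
```

With a cap of two nominations, alter C is a true tie in the toy data but goes unreported. Coding that relation as a 0 in a binary network treats an uncertain relation as a known absence, which is the kind of information loss the abstract warns about.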