Peng Siyun, Roth Adam R, Perry Brea L
Department of Sociology & Network Science Institute, Indiana University, USA.
Soc Networks. 2023 Jan;72:52-58. doi: 10.1016/j.socnet.2022.09.004. Epub 2022 Sep 13.
The social network perspective has great potential for advancing knowledge of social mechanisms in many fields. However, collecting egocentric (i.e., personal) network data is costly and places a heavy burden on respondents. This is especially true of the task used to elicit information on ties between network members (i.e., alter-alter ties or density matrix), which grows quadratically in length as network size increases. While most existing national surveys circumvent this problem by capping the number of network members that can be named, this strategy has major limitations. Here, we apply random sampling of network members to reduce cost, respondent burden, and error in network studies. We examine the effectiveness and reliability of random sampling in simulated and real-world egocentric network data. We find that in estimating sample/population means of network measures, randomly selecting a small number of network members produces only minor errors, regardless of true network size. For studies that use network measures in regressions, randomly selecting the mean number of network members (e.g., randomly selecting 10 alters when mean network size is 10) is enough to recover estimates of network measures that correlate close to 1 with those of the full sample. We conclude with recommendations for best practices that will make this versatile but resource-intensive methodology accessible to a wider group of researchers without sacrificing data quality.
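The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation design: it assumes, for illustration only, that alter-alter ties form independently with a fixed probability, and it uses density as the single network measure. All function names and parameter values here are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def pair_count(n):
    """Number of alter-alter pairs a respondent must report on for n alters."""
    return n * (n - 1) // 2


def simulate_ego_network(n_alters, tie_prob, rng):
    """Assumed toy model: each alter pair is tied independently with tie_prob."""
    alters = list(range(n_alters))
    ties = {pair for pair in combinations(alters, 2) if rng.random() < tie_prob}
    return alters, ties


def density(alters, ties):
    """Share of possible pairs among the given alters that are tied."""
    pairs = pair_count(len(alters))
    if pairs == 0:
        return 0.0
    # Sort so each pair is ordered (low, high), matching how ties are stored.
    present = sum(1 for pair in combinations(sorted(alters), 2) if pair in ties)
    return present / pairs


def sampled_density(alters, ties, k, rng):
    """Density computed only among k randomly selected alters."""
    chosen = rng.sample(alters, min(k, len(alters)))
    return density(chosen, ties)


def compare_means(n_egos=2000, k=5, tie_prob=0.3, seed=0):
    """Mean density from full networks versus from k sampled alters per ego."""
    rng = random.Random(seed)
    full, sampled = [], []
    for _ in range(n_egos):
        n_alters = rng.randint(5, 30)  # assumed range of true network sizes
        alters, ties = simulate_ego_network(n_alters, tie_prob, rng)
        full.append(density(alters, ties))
        sampled.append(sampled_density(alters, ties, k, rng))
    return mean(full), mean(sampled)


full_mean, sampled_mean = compare_means()
print(f"full: {full_mean:.3f}  sampled: {sampled_mean:.3f}")
print("pairs to report for 30 alters:", pair_count(30), "versus 5 alters:", pair_count(5))
```

Under these assumptions, asking about 5 sampled alters means reporting on 10 pairs instead of up to 435 for a 30-alter network, while the mean density across respondents stays close to the full-network value. Individual-level estimates are noisier than the mean, which is consistent with the abstract's recommendation of a larger sample when network measures enter regressions.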