Department of Mathematical Sciences, University of Texas at Dallas, Richardson, Texas, 75080, USA.
Chesapeake Biological Laboratory, University of Maryland Center for Environmental Science, Solomons, Maryland, 20688, USA.
Sci Rep. 2017 Jul 19;7(1):5807. doi: 10.1038/s41598-017-05885-x.
We propose a new nonparametric bootstrap method for quantifying estimation uncertainty in functions of the network degree distribution in large, ultra-sparse networks. Both the degree distribution and the network order are assumed to be unknown. The key idea is to adapt the "blocking" argument, developed for bootstrapping time series and re-tiling spatial data, to random networks. We first sample a set of ego networks of varying orders that together form a patch, the network analogue of a block, and then resample the data within patches. To select an optimal patch size, we develop a new computationally efficient, data-driven cross-validation algorithm. The proposed fast patchwork bootstrap (FPB) extends ideas developed for the case of the network mean degree to inference on the degree distribution. In addition, the FPB is substantially less computationally expensive, requires less information about the graph, and is free of nuisance parameters. In our simulation study, the new bootstrap method yields sharper and better-calibrated confidence intervals for functions of the network degree distribution than competing approaches, including for networks in the ultra-sparse regime. We illustrate the FPB with applications to collaboration networks in statistics and computer science and to Wikipedia networks.
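The sample-then-resample idea described in the abstract can be illustrated with a deliberately simplified Python sketch. It is not the published FPB algorithm. It assumes that a patch is the union of the 1-hop ego networks of uniformly chosen seed nodes, that the full degree of every patch node can be observed, and that degrees are resampled uniformly with replacement within the patch. It ignores the size bias that arises because neighbours of seeds tend to have higher degree, and it omits the cross-validation step for choosing the patch size. All function names and parameters (`ego_patch`, `patch_bootstrap_ci`, `n_seeds`, `n_boot`) are hypothetical.

```python
import random


def random_sparse_graph(n, n_edges, rng):
    """Toy undirected graph as an adjacency dict (illustration only)."""
    adj = {v: set() for v in range(n)}
    while n_edges > 0:
        u, v = rng.sample(range(n), 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            n_edges -= 1
    return adj


def ego_patch(adj, seeds):
    """Union of the 1-hop ego networks (seed plus neighbours) of the seeds."""
    patch = set()
    for s in seeds:
        patch.add(s)
        patch.update(adj[s])
    return patch


def patch_bootstrap_ci(adj, n_seeds, stat, n_boot=500, alpha=0.05, rng=None):
    """Percentile bootstrap interval for stat(degrees) from one patch.

    Assumption: degrees in the patch are resampled uniformly with
    replacement; the published method may weight or resample differently.
    """
    rng = rng or random.Random()
    seeds = rng.sample(sorted(adj), n_seeds)
    patch = ego_patch(adj, seeds)
    degrees = [len(adj[v]) for v in sorted(patch)]
    estimate = stat(degrees)
    boot = sorted(
        stat(rng.choices(degrees, k=len(degrees))) for _ in range(n_boot)
    )
    lo = boot[int((alpha / 2) * n_boot)]
    hi = boot[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return estimate, (lo, hi)


def mean_degree(degrees):
    return sum(degrees) / len(degrees)


if __name__ == "__main__":
    rng = random.Random(0)
    g = random_sparse_graph(2000, 2000, rng)
    est, (lo, hi) = patch_bootstrap_ci(g, n_seeds=30, stat=mean_degree, rng=rng)
    print(f"mean degree estimate {est:.2f}, 95% interval ({lo:.2f}, {hi:.2f})")
```

Any function of the degree list can be passed as `stat`, for example the fraction of nodes with degree zero. Because of the size bias noted above, the intervals from this sketch should not be expected to match the calibration reported for the FPB.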