IEEE Trans Cybern. 2019 Feb;49(2):626-637. doi: 10.1109/TCYB.2017.2783325. Epub 2018 Jan 3.
Stochastic block models (SBMs) have played an important role in modeling clusters or community structures in network data. However, they cannot handle several complex features that real-world networks ubiquitously exhibit, one of which is the power-law degree characteristic. To this end, we propose a new variant of SBM, termed power-law degree SBM (PLD-SBM), which introduces degree decay variables to explicitly encode the varying degree distribution over all nodes. With an exponential prior, PLD-SBM is proved to approximately preserve the scale-free feature of real networks. In addition, the inference in the variational E-step shows that the introduced degree decay factors allow PLD-SBM to correct the bias inherent in SBM. Furthermore, experiments on synthetic networks and two real-world datasets, the Adolescent Health Data and the political blogs network, verify the effectiveness of the proposed model in terms of cluster prediction accuracy.
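To make the idea of per-node degree decay concrete, the following is a minimal sketch of a generative process, not the model as specified in the paper. The abstract does not give the functional form, so the edge probability used here, the block probability scaled by exp(-d_i) * exp(-d_j) with d_i drawn from an exponential prior, is an assumption chosen for illustration. The names `sample_network`, `degrees`, `block_probs`, `mixing`, and `decay_rate` are likewise hypothetical. Setting `decay_rate=None` gives a plain SBM, which is useful for comparing degree spreads.

```python
import math
import random


def sample_network(n, block_probs, mixing, decay_rate=None, seed=0):
    """Sample an undirected graph from an SBM, optionally with degree decay.

    ASSUMPTION: the edge probability
        block_probs[z_i][z_j] * exp(-d_i) * exp(-d_j),  d_i ~ Exponential(decay_rate)
    is an illustrative form only, not the parameterization from the paper.
    """
    rng = random.Random(seed)
    num_blocks = len(mixing)
    # Cluster label for each node, drawn from the mixing proportions.
    labels = rng.choices(range(num_blocks), weights=mixing, k=n)
    # One degree decay variable per node; zero decay recovers the plain SBM.
    if decay_rate is None:
        decay = [0.0] * n
    else:
        decay = [rng.expovariate(decay_rate) for _ in range(n)]

    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            base = block_probs[labels[i]][labels[j]]
            # Larger decay values shrink a node's chance of forming any edge.
            p = base * math.exp(-(decay[i] + decay[j]))
            if rng.random() < p:
                adjacency[i][j] = adjacency[j][i] = 1
    return labels, decay, adjacency


def degrees(adjacency):
    """Return the degree of every node in a 0/1 adjacency matrix."""
    return [sum(row) for row in adjacency]


# Two assortative blocks; compare the largest degrees with and without decay.
B = [[0.30, 0.02], [0.02, 0.30]]
_, _, plain = sample_network(200, B, [0.5, 0.5], decay_rate=None, seed=1)
_, _, decayed = sample_network(200, B, [0.5, 0.5], decay_rate=1.0, seed=1)
print(sorted(degrees(plain), reverse=True)[:5])
print(sorted(degrees(decayed), reverse=True)[:5])
```

In the plain SBM all nodes in a block share the same expected degree, whereas the decay variables in this sketch spread expected degrees across nodes within a block. The sketch covers sampling only; it does not implement the variational inference mentioned in the abstract.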