Larremore Daniel B, Clauset Aaron, Jacobs Abigail Z
Center for Communicable Disease Dynamics, Harvard School of Public Health, Boston, Massachusetts 02115, USA and Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts 02115, USA.
Department of Computer Science, University of Colorado, Boulder, Colorado 80309, USA and Santa Fe Institute, Santa Fe, New Mexico 87501, USA and BioFrontiers Institute, University of Colorado, Boulder, Colorado 80303, USA.
Phys Rev E Stat Nonlin Soft Matter Phys. 2014 Jul;90(1):012805. doi: 10.1103/PhysRevE.90.012805. Epub 2014 Jul 10.
Bipartite networks are a common type of network data in which there are two types of vertices, and only vertices of different types can be connected. While bipartite networks exhibit community structure like their unipartite counterparts, existing approaches to bipartite community detection have drawbacks, including implicit parameter choices, loss of information through one-mode projections, and lack of interpretability. Here we solve the community detection problem for bipartite networks by formulating a bipartite stochastic block model, which explicitly includes vertex type information and may be trivially extended to k-partite networks. This bipartite stochastic block model yields a projection-free and statistically principled method for community detection that makes clear assumptions and parameter choices and yields interpretable results. We demonstrate this model's ability to efficiently and accurately find community structure in synthetic bipartite networks with known structure and in real-world bipartite networks with unknown structure, and we characterize its performance in practical contexts.
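As a rough illustration of the idea of a block model that "explicitly includes vertex type information," the following minimal sketch samples a synthetic bipartite network with planted communities. It is a simplified Bernoulli-edge variant written for illustration only and should not be read as the formulation or inference procedure used in the paper. The function name `sample_bipartite_sbm`, the block labels, and the probabilities in `omega` are all hypothetical.

```python
import random


def sample_bipartite_sbm(types, blocks, omega, seed=0):
    """Sample edges of a bipartite graph with planted block structure.

    types[i]  -- vertex type of vertex i (0 or 1)
    blocks[i] -- community label of vertex i
    omega     -- dict mapping a pair of block labels to an edge probability
                 (illustrative assumption: independent Bernoulli edges)
    """
    rng = random.Random(seed)
    edges = []
    n = len(types)
    for i in range(n):
        for j in range(i + 1, n):
            # Vertex types enter the model directly: same-type pairs
            # can never be connected, so they are skipped outright.
            if types[i] == types[j]:
                continue
            key = (blocks[i], blocks[j])
            # Look up the block pair in either order; default to no edge.
            p = omega.get(key, omega.get(key[::-1], 0.0))
            if rng.random() < p:
                edges.append((i, j))
    return edges


# Hypothetical example: 4 vertices of each type, 2 communities per type.
types = [0] * 4 + [1] * 4
blocks = ["a1", "a1", "a2", "a2", "b1", "b1", "b2", "b2"]
omega = {
    ("a1", "b1"): 0.9,  # dense within matched communities
    ("a2", "b2"): 0.9,
    ("a1", "b2"): 0.1,  # sparse across communities
    ("a2", "b1"): 0.1,
}
edges = sample_bipartite_sbm(types, blocks, omega, seed=42)
print(edges)
```

Because same-type pairs are excluded at sampling time, no one-mode projection is involved, and every edge in the result joins vertices of different types. Community detection would then amount to recovering `blocks` from `edges` and `types`, which this sketch does not attempt.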