Berry Jonathan W, Hendrickson Bruce, LaViolette Randall A, Phillips Cynthia A
Sandia National Laboratories, P.O. Box 5800, Albuquerque, New Mexico 87185, USA.
Phys Rev E Stat Nonlin Soft Matter Phys. 2011 May;83(5 Pt 2):056119. doi: 10.1103/PhysRevE.83.056119. Epub 2011 May 25.
Communities of vertices within a giant network such as the World Wide Web are likely to be vastly smaller than the network itself. However, Fortunato and Barthélemy have proved that modularity maximization algorithms for community detection may fail to resolve communities with fewer than √L/2 edges, where L is the number of edges in the entire network. This resolution limit leads modularity maximization algorithms to have notoriously poor accuracy on many real networks. Fortunato and Barthélemy's argument can be extended to networks with weighted edges as well, and we derive this corollary argument. We conclude that weighted modularity algorithms may fail to resolve communities with less than √Wε/2 total edge weight, where W is the total edge weight in the network and ε is the maximum weight of an intercommunity edge. If ε is small, then small communities can be resolved. Given a weighted or unweighted network, we describe how to derive new edge weights in order to achieve a low ε, we modify the Clauset, Newman, and Moore (CNM) community detection algorithm to maximize weighted modularity, and we show that the resulting algorithm has greatly improved accuracy. In experiments with an emerging community standard benchmark, we find that our simple CNM variant is competitive with the most accurate community detection methods yet proposed.
在诸如万维网这样的巨型网络中,顶点社区可能比网络本身小得多。然而,福尔图纳托和巴塞勒米证明,用于社区检测的模块化最大化算法可能无法解析边数少于√L/2的社区,其中L是整个网络中的边数。这种分辨率限制导致模块化最大化算法在许多真实网络上的准确性 notoriously poor。福尔图纳托和巴塞勒米的论点也可以扩展到具有加权边的网络,我们推导了这个推论论点。我们得出结论,加权模块化算法可能无法解析总边权小于√Wε/2的社区,其中W是网络中的总边权,ε是社区间边的最大权值。如果ε很小,那么小社区就可以被解析。给定一个加权或未加权的网络,我们描述了如何推导新的边权值以实现低ε,我们修改了克劳塞特、纽曼和摩尔(CNM)社区检测算法以最大化加权模块化,并且我们表明所得算法的准确性有了很大提高。在使用新兴社区标准基准的实验中,我们发现我们简单的CNM变体与迄今提出的最准确的社区检测方法具有竞争力。