Clauset Aaron
Department of Computer Science, University of New Mexico, Albuquerque, New Mexico 87131, USA.
Phys Rev E Stat Nonlin Soft Matter Phys. 2005 Aug;72(2 Pt 2):026132. doi: 10.1103/PhysRevE.72.026132. Epub 2005 Aug 29.
Although the inference of global community structure in networks has recently become a topic of great interest in the physics community, all such algorithms require that the graph be completely known. Here, we define both a measure of local community structure and an algorithm that infers the hierarchy of communities that enclose a given vertex by exploring the graph one vertex at a time. This algorithm runs in time O(k2d) for general graphs when d is the mean degree and k is the number of vertices to be explored. For graphs where exploring a new vertex is time consuming, the running time is linear, O(k). We show that on computer-generated graphs the average behavior of this technique approximates that of algorithms that require global knowledge. As an application, we use this algorithm to extract meaningful local clustering information in the large recommender network of an online retailer.
尽管网络中全球社区结构的推断最近已成为物理学界非常感兴趣的一个话题,但所有此类算法都要求图是完全已知的。在这里,我们定义了一种局部社区结构的度量以及一种算法,该算法通过一次探索一个顶点来推断包围给定顶点的社区层次结构。对于一般图,当d是平均度数且k是要探索的顶点数时,该算法的运行时间为O(k2d)。对于探索新顶点耗时的图,运行时间是线性的,即O(k)。我们表明,在计算机生成的图上,该技术的平均行为近似于需要全局知识的算法的行为。作为一个应用,我们使用该算法在一家在线零售商的大型推荐网络中提取有意义的局部聚类信息。