Suppr超能文献

在超大型网络中寻找社区结构。

Finding community structure in very large networks.

作者信息

Clauset Aaron, Newman M E J, Moore Cristopher

机构信息

Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA.

出版信息

Phys Rev E Stat Nonlin Soft Matter Phys. 2004 Dec;70(6 Pt 2):066111. doi: 10.1103/PhysRevE.70.066111. Epub 2004 Dec 6.

Abstract

The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O (md log n) where d is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with m approximately n and d approximately log n, in which case our algorithm runs in essentially linear time, O (n log(2) n). As an example of the application of this algorithm we use it to analyze a network of items for sale on the web site of a large on-line retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400 000 vertices and 2 x 10(6) edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.

摘要

网络中社区结构的发现与分析是近期物理学界相当感兴趣的一个话题,但迄今为止提出的大多数方法由于计算成本过高而不适用于非常大的网络。在此,我们提出一种用于检测社区结构的层次凝聚算法,它比许多竞争算法都要快:在一个具有(n)个顶点和(m)条边的网络上,其运行时间为(O(md\log n)),其中(d)是描述社区结构的树状图的深度。许多现实世界的网络是稀疏且分层的,(m)近似于(n)且(d)近似于(\log n),在这种情况下我们的算法基本上以线性时间(O(n\log^2 n))运行。作为该算法应用的一个例子,我们用它来分析一家大型在线零售商网站上待售商品的网络,如果商品经常被同一买家购买,那么网络中的这些商品就会被链接起来。该网络有超过40万个顶点和(2\times10^6)条边。我们表明我们的算法能够从这个网络中提取有意义的社区,揭示出客户购买习惯中存在的大规模模式。

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验