Mandala Supreet, Kumara Soundar, Yao Tao
Industrial Engineering Department, Pennsylvania State University, University Park, PA 16802, USA.
Phys Rev E Stat Nonlin Soft Matter Phys. 2012 Jul;86(1 Pt 2):016111. doi: 10.1103/PhysRevE.86.016111. Epub 2012 Jul 23.
The problem of graph clustering or community detection has enjoyed a lot of attention in complex networks literature. A quality function, modularity, quantifies the strength of clustering and on maximization yields sensible partitions. However, in most real world networks, there are an exponentially large number of near-optimal partitions with some being very different from each other. Therefore, picking an optimal clustering among the alternatives does not provide complete information about network topology. To tackle this problem, we propose a graph perturbation scheme which can be used to identify an ensemble of near-optimal and diverse clusterings. We establish analytical properties of modularity function under the perturbation which ensures diversity. Our approach is algorithm independent and therefore can leverage any of the existing modularity maximizing algorithms. We numerically show that our methodology can systematically identify very different partitions on several existing data sets. The knowledge of diverse partitions sheds more light into the topological organization and helps gain a more complete understanding of the underlying complex network.
图聚类或社区检测问题在复杂网络文献中备受关注。一个质量函数,即模块度,量化了聚类的强度,通过最大化模块度可得到合理的划分。然而,在大多数现实世界网络中,存在指数级数量的近似最优划分,其中一些彼此差异很大。因此,在这些备选划分中挑选出一个最优聚类并不能提供关于网络拓扑的完整信息。为了解决这个问题,我们提出了一种图扰动方案,该方案可用于识别一组近似最优且多样的聚类。我们建立了扰动下模块度函数的分析性质,以确保多样性。我们的方法与算法无关,因此可以利用任何现有的最大化模块度算法。我们通过数值表明,我们的方法可以系统地在几个现有数据集上识别出非常不同的划分。多样划分的知识能更深入地揭示拓扑结构,并有助于更全面地理解底层复杂网络。