Zhang Shuqin, Zhao Hongyu
Center for Computational Systems Biology, School of Mathematical Sciences, Fudan University, Shanghai 200433, China.
Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut 06520, USA.
Phys Rev E Stat Nonlin Soft Matter Phys. 2013 Nov;88(5):052802. doi: 10.1103/PhysRevE.88.052802. Epub 2013 Nov 4.
As a fundamental problem in network study, community identification has attracted much attention from different fields. The modularity optimization method, a seminal work in this area, has been widely applied and studied. However, this method suffers from a resolution limit and extreme degeneracy, and it may not perform well for networks with unbalanced structures. Although several methods have been proposed to overcome these limitations, they all rest on the original idea of defining modularity by comparing the total number of edges within the putative communities of the observed network with that in an equivalent randomly generated network. In this paper, we show that this modularity definition is not suitable for analyzing some networks, such as those with unbalanced structures. Instead, we propose to define modularity through the average degree within communities, formulating it as a comparison between the sum of average degrees within communities of the observed network and that of an equivalent randomly generated network. We also propose a degree-adjusted approach for further improvement when there are unbalanced structures, and we analyze the theoretical properties of this degree-adjusted method. Numerical experiments on both artificial and real networks demonstrate that average degree plays an important role in network community identification and that our proposed methods perform better than existing ones.
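The contrast between the two definitions can be sketched in a few lines of Python. The first function is the standard Newman-Girvan modularity, which compares within-community edge totals against a degree-preserving random network. The second is only one plausible reading of the average-degree idea described in the abstract: it assumes each community's observed-minus-expected term is divided by the community size. The abstract does not give the formula, so this normalization, the function names, and the toy graph are all assumptions for illustration. The degree-adjusted variant is not attempted here.

```python
from collections import defaultdict


def community_stats(edges, membership):
    """Return (m, size, internal, degree_sum) for an undirected simple graph.

    edges: list of (u, v) pairs without self-loops or duplicates.
    membership: dict mapping each node to its community label.
    """
    size = defaultdict(int)        # number of nodes per community
    internal = defaultdict(int)    # number of edges with both ends inside
    degree_sum = defaultdict(int)  # sum of node degrees per community
    for node, c in membership.items():
        size[c] += 1
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return len(edges), size, internal, degree_sum


def newman_modularity(edges, membership):
    """Standard modularity: sum over c of e_c/m - (d_c / 2m)^2."""
    m, size, internal, degree_sum = community_stats(edges, membership)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in size)


def average_degree_modularity(edges, membership):
    """ASSUMED form: sum over c of (2*e_c - d_c^2 / 2m) / n_c.

    2*e_c / n_c is the observed average internal degree of community c;
    d_c^2 / (2m * n_c) is its expectation under a degree-preserving
    random network. This is an illustrative guess, not the paper's formula.
    """
    m, size, internal, degree_sum = community_stats(edges, membership)
    return sum(
        (2 * internal[c] - degree_sum[c] ** 2 / (2 * m)) / size[c] for c in size
    )


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
membership = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

print(newman_modularity(edges, membership))
print(average_degree_modularity(edges, membership))
```

Because the assumed variant divides by community size, a small dense community is not penalized merely for contributing few edges to the total, which is the kind of unbalanced structure the abstract says edge-count modularity handles poorly.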