Department of Electronics and Information, Politecnico di Milano, Milano, Italy.
PLoS One. 2011;6(11):e27028. doi: 10.1371/journal.pone.0027028. Epub 2011 Nov 3.
Identifying communities (or clusters), namely groups of nodes with comparatively strong internal connectivity, is a fundamental task for understanding the structure and function of a network. Yet formal criteria for defining communities and for testing their significance are lacking. We propose a sharp definition based on a quality threshold. By means of a lumped Markov chain model of a random walker, a quality measure called "persistence probability" is associated with each cluster, which is then defined as an "α-community" if this probability is not smaller than α. Consistently, a partition composed of α-communities is an "α-partition." These definitions prove very effective for finding and testing communities. If a set of candidate partitions is available, setting the desired α-level allows one to immediately select the α-partition with the finest decomposition. At the same time, the persistence probabilities quantify the quality of each individual community. Because it assesses each cluster individually, this approach can also reveal single well-defined communities even in networks that, overall, do not possess a definite clustered structure.
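The selection rule described above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the network is undirected, so the stationary distribution of the random walker is proportional to node strength, and the persistence probability of a cluster (the probability that a walker currently in the cluster is still in it after one step) reduces to the ratio of the cluster's internal strength to its total strength. The function names, the toy graph, and the reading of "finest" as "largest number of clusters" are hypothetical choices made for illustration only.

```python
from typing import List, Optional, Sequence

Partition = Sequence[Sequence[int]]


def persistence_probabilities(adj: Sequence[Sequence[float]],
                              partition: Partition) -> List[float]:
    """Persistence probability of each cluster, assuming an undirected graph.

    Under that assumption the value for a cluster is
    (sum of adj[i][j] for i, j in the cluster) / (sum of strengths in the cluster).
    """
    strength = [sum(row) for row in adj]
    result = []
    for cluster in partition:
        members = set(cluster)
        internal = sum(adj[i][j] for i in members for j in members)
        total = sum(strength[i] for i in members)
        result.append(internal / total if total > 0 else 0.0)
    return result


def is_alpha_partition(adj, partition: Partition, alpha: float) -> bool:
    """True if every cluster has persistence probability not smaller than alpha."""
    return all(u >= alpha for u in persistence_probabilities(adj, partition))


def finest_alpha_partition(adj, candidates: Sequence[Partition],
                           alpha: float) -> Optional[Partition]:
    """Among the candidates, return the alpha-partition with the most clusters."""
    valid = [p for p in candidates if is_alpha_partition(adj, p, alpha)]
    return max(valid, key=len) if valid else None


# Toy graph (hypothetical): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1.0

candidates = [
    [[0, 1, 2, 3, 4, 5]],          # one cluster containing every node
    [[0, 1, 2], [3, 4, 5]],        # the two triangles
    [[0, 1], [2], [3], [4, 5]],    # an over-fragmented partition
]

two_triangles = persistence_probabilities(adj, candidates[1])  # 6/7 for each triangle
best = finest_alpha_partition(adj, candidates, alpha=0.8)
```

In this toy case each triangle keeps the walker with probability 6/7, so the two-triangle partition is the finest candidate that passes at α = 0.8, while the over-fragmented candidate fails because its singleton clusters have persistence probability 0. For directed networks the stationary distribution would have to be computed explicitly, which this sketch does not attempt.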