Computer Science Department, Stanford University, 353 Serra Mall, Stanford, CA, 94305, USA.
Chan Zuckerberg Biohub, 499 Illinois St., San Francisco, CA, 94158, USA.
Nat Commun. 2018 Jun 29;9(1):2544. doi: 10.1038/s41467-018-04948-5.
Uncovering modular structure in networks is fundamental for systems in biology, physics, and engineering. Community detection identifies candidate modules as hypotheses, which then need to be validated through experiments, such as mutagenesis in a biological laboratory. Only a few communities can typically be validated, and it is thus important to prioritize which communities to select for downstream experimentation. Here we develop CRANK, a mathematically principled approach for prioritizing network communities. CRANK efficiently evaluates robustness and magnitude of structural features of each community and then combines these features into the community prioritization. CRANK can be used with any community detection method. It needs only information provided by the network structure and does not require any additional metadata or labels. However, when available, CRANK can incorporate domain-specific information to further boost performance. Experiments on many large networks show that CRANK effectively prioritizes communities, yielding a nearly 50-fold improvement in community prioritization.
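The idea of scoring each community on several structural features and then combining the scores into one ranking can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not CRANK's actual definitions: the two features (`density`, `boundary`), the mean-rank combination in `prioritize`, and the toy graph are hypothetical stand-ins, and the robustness evaluation mentioned in the abstract is omitted.

```python
from itertools import combinations


def density(adj, community):
    """Fraction of possible node pairs inside the community that are linked."""
    n = len(community)
    if n < 2:
        return 0.0
    internal = sum(1 for u, v in combinations(community, 2) if v in adj[u])
    return internal / (n * (n - 1) / 2)


def boundary(adj, community):
    """Fraction of edge endpoints of community nodes that stay inside it."""
    inside = outside = 0
    for u in community:
        for v in adj[u]:
            if v in community:
                inside += 1
            else:
                outside += 1
    total = inside + outside
    return inside / total if total else 0.0


def prioritize(adj, communities):
    """Return community indices, best first, by mean rank across features.

    Illustrative aggregation only; not the aggregation used by CRANK.
    """
    features = [density, boundary]
    mean_rank = [0.0] * len(communities)
    for feature in features:
        scores = [feature(adj, c) for c in communities]
        # Higher score is better, so sort indices by descending score.
        order = sorted(range(len(communities)), key=lambda i: -scores[i])
        for rank, i in enumerate(order):
            mean_rank[i] += rank / len(features)
    return sorted(range(len(communities)), key=lambda i: mean_rank[i])


# Toy undirected graph: a triangle (a, b, c) attached to a path (d, e, f).
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

communities = [{"a", "b", "c"}, {"d", "e", "f"}]
ranking = prioritize(adj, communities)
print(ranking)
```

The sketch uses only network structure, with no labels or metadata, and it accepts communities from any detection method because it takes them as plain sets of nodes.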