Hall-Swan Sarah, Crawford Jake, Newman Rebecca, Cowen Lenore J
Department of Computer Science, Tufts University, Medford, MA 02155, USA.
BMC Syst Biol. 2018 Mar 21;12(Suppl 3):24. doi: 10.1186/s12918-018-0550-5.
Decomposing a protein-protein interaction network (PPI network) into non-overlapping clusters or communities, sometimes called "network modules," is an important way to explore the functional roles of sets of genes. When the decomposition is based solely on graph-theoretic measures of the network's interconnection structure, it is often called unsupervised clustering or community detection. In this study, we compare unsupervised computational methods for decomposing a PPI network into non-overlapping modules. A method is preferred if it assigns a large proportion of nodes to functionally meaningful modules, as measured by functional enrichment over terms from the Gene Ontology (GO).
We compare the performance of three popular community detection algorithms with that of the same algorithms run after the network has been pre-processed by removing and reweighting edges based on the diffusion state distance (DSD) between pairs of nodes in the network. We call this "detangling" the network. In almost all cases, we find that detangling the network by DSD-based reweighting yields more functionally meaningful clusters.
Re-embedding the network using the DSD metric before applying standard community detection algorithms can help uncover clusters that are functionally enriched for GO terms in the yeast PPI network.
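The detangling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which edges are removed or how the remaining ones are reweighted, so the cutoff `max_dsd`, the `1 / DSD` weighting, the walk length `k`, and the function names `dsd_matrix` and `detangle` are all assumptions made for illustration. The sketch uses the finite-walk form of DSD, in which the distance between two nodes is the L1 distance between their vectors of expected visit counts over a k-step random walk.

```python
def dsd_matrix(adj, k=5):
    """Return {(u, v): DSD} for an undirected graph.

    adj maps each node to the set of its neighbours; every node is
    assumed to have at least one neighbour. k is an assumed walk length.
    """
    nodes = sorted(adj)
    idx = {u: i for i, u in enumerate(nodes)}
    n = len(nodes)

    # One-step transition matrix of the simple random walk.
    P = [[0.0] * n for _ in range(n)]
    for u in nodes:
        for v in adj[u]:
            P[idx[u]][idx[v]] = 1.0 / len(adj[u])

    # visits[i][j]: expected number of visits to j during a k-step walk
    # started at i, counting the start itself (step 0).
    step = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    visits = [row[:] for row in step]
    for _ in range(k):
        step = [[sum(step[i][m] * P[m][j] for m in range(n))
                 for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                visits[i][j] += step[i][j]

    # DSD(u, v) is the L1 distance between the two visit-count vectors.
    return {(u, v): sum(abs(x - y)
                        for x, y in zip(visits[idx[u]], visits[idx[v]]))
            for u in nodes for v in nodes}


def detangle(adj, k=5, max_dsd=3.0):
    """Drop edges whose DSD exceeds max_dsd and reweight the rest.

    Both the cutoff and the 1 / DSD weight are hypothetical choices.
    """
    dsd = dsd_matrix(adj, k)
    weighted = {}
    for u in adj:
        for v in adj[u]:
            if u < v and dsd[(u, v)] <= max_dsd:
                # Guard against a zero distance on degenerate graphs.
                weighted[(u, v)] = 1.0 / max(dsd[(u, v)], 1e-12)
    return weighted


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge c-d.
    graph = {
        "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
        "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
    }
    for edge, weight in sorted(detangle(graph).items()):
        print(edge, round(weight, 3))
```

The weighted edge list returned by `detangle` could then be passed to any community detection algorithm that accepts edge weights. On the toy graph, the bridge between the two triangles has a much larger DSD than an edge inside a triangle, which is the kind of separation a cutoff is meant to exploit.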