Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany.
PLoS Comput Biol. 2010 Jan 29;6(1):e1000659. doi: 10.1371/journal.pcbi.1000659.
It is widely believed that the modular organization of cellular function is reflected in a modular structure of molecular networks. A common view is that a "module" in a network is a cohesively linked group of nodes, densely connected internally and sparsely interacting with the rest of the network. Many algorithms try to identify functional modules in protein-interaction networks (PINs) by searching for such cohesive groups of proteins. Here, we present an alternative approach independent of any prior definition of what actually constitutes a "module". In a self-consistent manner, proteins are grouped into "functional roles" if they interact in similar ways with other proteins according to their functional roles. Such grouping may well result in cohesive modules again, but only if the network structure actually supports this. We applied our method to the PIN from the Human Protein Reference Database (HPRD) and found that a representation of the network in terms of cohesive modules, at least on a global scale, does not optimally represent the network's structure because it focuses on finding independent groups of proteins. In contrast, a decomposition into functional roles depicts the structure much better, as it also takes into account the interdependencies between roles and even allows groupings based on the absence of interactions between proteins in the same functional role. This is the case, for example, for transmembrane proteins, which could never be recognized as a cohesive group of nodes in a PIN. When mapping experimental methods onto the groups, we identified profound differences in coverage, suggesting that our method is able to capture experimental bias in the data, too. For example, yeast two-hybrid data were highly overrepresented in one particular group. Thus, there is more structure in protein-interaction networks than cohesive modules alone, and we believe this finding can significantly improve automated function prediction algorithms.
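The difference between a cohesive module and a functional role can be illustrated with a minimal sketch. This is not the authors' self-consistent method. It is a simplified stand-in that groups nodes by strict structural equivalence (identical neighbor sets), and the toy network and the names `M1`, `M2`, `M3`, `A`, `B` are hypothetical. It shows how proteins that never interact with each other can still share a role because they interact with the same partners.

```python
from itertools import combinations

# Hypothetical toy network (an assumption for illustration only):
# three "membrane-like" proteins M1..M3 never bind each other,
# but each binds the same two partners A and B.
edges = [
    ("M1", "A"), ("M1", "B"),
    ("M2", "A"), ("M2", "B"),
    ("M3", "A"), ("M3", "B"),
]

def build_neighbors(edge_list):
    """Return a dict mapping each node to the set of its interaction partners."""
    neighbors = {}
    for u, v in edge_list:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors

def group_by_neighborhood(neighbors):
    """Group nodes whose neighbor sets are identical (strict structural equivalence)."""
    groups = {}
    for node, nbrs in neighbors.items():
        groups.setdefault(frozenset(nbrs), []).append(node)
    return sorted(sorted(group) for group in groups.values())

def internal_edges(group, neighbors):
    """Count interactions between members of the same group."""
    return sum(1 for u, v in combinations(group, 2) if v in neighbors[u])

neighbors = build_neighbors(edges)
roles = group_by_neighborhood(neighbors)
for role in roles:
    print(role, "internal edges:", internal_edges(role, neighbors))
```

In this toy case both recovered groups have zero internal edges, so a search for densely connected groups would not find them, whereas grouping by interaction pattern does. Real PINs are noisy, so an approach like the one described in the abstract has to tolerate approximate rather than exact similarity.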