Department of Physics, University of Houston, Houston, Texas, United States of America.
PLoS Comput Biol. 2012;8(2):e1002391. doi: 10.1371/journal.pcbi.1002391. Epub 2012 Feb 23.
Determining the functional structure of biological networks is a central goal of systems biology. One approach is to analyze gene expression data to infer a network of gene interactions on the basis of their correlated responses to environmental and genetic perturbations. The inferred network can then be analyzed to identify functional communities. However, commonly used algorithms can yield unreliable results because of experimental noise, algorithmic stochasticity, and the influence of arbitrarily chosen parameter values. Furthermore, the results typically provide only a simplistic view of the network, partitioned into disjoint communities, and give no information about the relationships between communities. Here, we present methods to robustly detect co-regulated and functionally enriched gene communities and demonstrate their application and validity for Escherichia coli gene expression data. Applying a recently developed community detection algorithm to the network of interactions identified with the context likelihood of relatedness (CLR) method, we show that a hierarchy of network communities can be identified. These communities are significantly enriched for gene ontology (GO) terms, consistent with their representing biologically meaningful groups. Further, analysis of the most significantly enriched communities identified several candidate new regulatory interactions. The robustness of our methods is demonstrated by showing that a core set of functional communities is reliably found when artificial noise, modeling experimental noise, is added to the data. We find that noise mainly acts conservatively: it increases the relatedness required for a network link to be reliably assigned and decreases the size of the core communities, rather than causing genes to associate into new communities.
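The link-scoring and noise-robustness steps described above can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a genes-by-conditions expression matrix, uses absolute Pearson correlation as a stand-in for the mutual information that CLR is normally built on, and models experimental noise as additive Gaussian noise. The function names `clr_scores` and `edge_stability` and the parameters `threshold`, `noise_sd`, and `n_trials` are hypothetical choices made for illustration only.

```python
import numpy as np


def clr_scores(sim):
    """CLR-style scores from a symmetric gene-gene similarity matrix.

    Each pairwise value is converted to a z-score against the background
    of all other values involving gene i, and likewise for gene j.
    Negative z-scores are clipped to zero and the two are combined as
    sqrt(z_i**2 + z_j**2).
    """
    sim = np.array(sim, dtype=float)
    np.fill_diagonal(sim, np.nan)  # exclude self-similarity from the background
    mean = np.nanmean(sim, axis=1, keepdims=True)
    std = np.nanstd(sim, axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for flat backgrounds
    z = np.maximum(0.0, (sim - mean) / std)  # z[i, j]: pair (i, j) vs. gene i
    scores = np.sqrt(z ** 2 + z.T ** 2)
    np.fill_diagonal(scores, 0.0)  # no self-links
    return scores


def edge_stability(expr, threshold, noise_sd, n_trials=20, seed=0):
    """Fraction of noisy replicates in which each link passes the threshold.

    expr is a (genes x conditions) array. Gaussian noise with standard
    deviation noise_sd is added independently in each trial (an assumed
    noise model, not one taken from the paper).
    """
    expr = np.asarray(expr, dtype=float)
    rng = np.random.default_rng(seed)
    n_genes = expr.shape[0]
    counts = np.zeros((n_genes, n_genes))
    for _ in range(n_trials):
        noisy = expr + rng.normal(0.0, noise_sd, size=expr.shape)
        sim = np.abs(np.corrcoef(noisy))  # stand-in for mutual information
        counts += clr_scores(sim) >= threshold
    return counts / n_trials
```

Links whose stability stays near 1 as `noise_sd` grows would correspond, in this simplified picture, to the reliably assigned links from which core communities are built. A community detection step would then be run on the thresholded network.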