Department of Human Genetics, University of California, Los Angeles, Los Angeles, California, United States of America.
PLoS Comput Biol. 2011 Jan 20;7(1):e1001057. doi: 10.1371/journal.pcbi.1001057.
In many applications, one is interested in determining which of the properties of a network module change across conditions. For example, to validate the existence of a module, it is desirable to show that it is reproducible (or preserved) in an independent test network. Here we study several types of network preservation statistics that do not require a module assignment in the test network. We distinguish network preservation statistics by the type of the underlying network. Some preservation statistics are defined for a general network (defined by an adjacency matrix), while others are defined only for a correlation network (constructed on the basis of pairwise correlations between numeric variables). Our applications show that the correlation structure facilitates the definition of particularly powerful module preservation statistics. We illustrate that evaluating module preservation is in general different from evaluating cluster preservation. We find that it is advantageous to aggregate multiple preservation statistics into summary preservation statistics. We illustrate the use of these methods in six gene co-expression network applications including 1) preservation of the cholesterol biosynthesis pathway in mouse tissues, 2) comparison of human and chimpanzee brain networks, 3) preservation of selected KEGG pathways between human and chimpanzee brain networks, 4) sex differences in human cortical networks, and 5) sex differences in mouse liver networks. While we find no evidence for sex-specific modules in human cortical networks, we find that several human cortical modules are less preserved in chimpanzees. In particular, apoptosis genes are differentially co-expressed between humans and chimpanzees. Our simulation studies and applications show that module preservation statistics are useful for studying differences between the modular structure of networks.
Data, R software and accompanying tutorials can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/ModulePreservation.
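As an illustration only, the idea of a preservation statistic for a general network (one defined by an adjacency matrix) can be sketched in a few lines of Python. The abstract gives no formulas, so the two measures below are assumptions chosen for illustration: a density-type measure (mean adjacency among the module's nodes in the test network) and a connectivity-type measure (correlation of intramodular connectivity between the reference and test networks). The function names and the toy matrices are hypothetical and are not taken from the accompanying R software.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mean_adjacency(adj, module):
    """Assumed density-type measure: mean adjacency over all node pairs in the module."""
    pairs = [(i, j) for pos, i in enumerate(module) for j in module[pos + 1:]]
    return sum(adj[i][j] for i, j in pairs) / len(pairs)

def intramodular_connectivity(adj, module):
    """For each module node, the sum of its adjacencies to the other module nodes."""
    return [sum(adj[i][j] for j in module if j != i) for i in module]

def connectivity_preservation(ref_adj, test_adj, module):
    """Assumed connectivity-type measure: correlation of intramodular
    connectivity between the reference and the test network."""
    return pearson(intramodular_connectivity(ref_adj, module),
                   intramodular_connectivity(test_adj, module))

# Toy data (hypothetical): a module of nodes 0-2 defined in the reference network.
# No module assignment is needed in the test network, only its adjacency matrix.
ref_adj = [[1.0, 0.9, 0.8, 0.1],
           [0.9, 1.0, 0.7, 0.2],
           [0.8, 0.7, 1.0, 0.1],
           [0.1, 0.2, 0.1, 1.0]]
test_adj = [[1.0, 0.8, 0.7, 0.2],
            [0.8, 1.0, 0.5, 0.1],
            [0.7, 0.5, 1.0, 0.3],
            [0.2, 0.1, 0.3, 1.0]]
module = [0, 1, 2]

print("density in test network:", mean_adjacency(test_adj, module))
print("connectivity preservation:", connectivity_preservation(ref_adj, test_adj, module))
```

A module that stays dense in the test network and keeps a similar connectivity pattern would score high on both measures. How such measures are aggregated into a summary statistic is left to the accompanying software and tutorials.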