Zampieri Mattia, Soranzo Nicola, Bianchini Daniele, Altafini Claudio
SISSA-ISAS, International School for Advanced Studies, Trieste, Italy.
PLoS One. 2008 Aug 20;3(8):e2981. doi: 10.1371/journal.pone.0002981.
The concept of reverse engineering a gene network, i.e., inferring a genome-wide graph of putative gene-gene interactions from compendia of high-throughput microarray data, has been extensively used in the last few years to deduce, integrate, or validate various types of "physical" networks of interactions among genes or gene products.
This paper gives a comprehensive overview of which of these networks emerge significantly when reverse engineering large collections of gene expression data for two model organisms, E. coli and S. cerevisiae, without any prior information. For E. coli, the pattern of co-expression is shown to reflect in fine detail both the operonal structure of the DNA and the regulatory effects exerted by the gene products when they participate in the same protein complex. For S. cerevisiae, we find that direct transcriptional control (e.g., transcription factor-binding site interactions) has little statistical significance in comparison to the other regulatory mechanisms (such as sharing a protein complex, or co-localization in a metabolic pathway or compartment), which are, however, resolved at a lower level of detail than in E. coli.
The gene co-expression patterns deduced from compendia of profiling experiments tend to unveil functional categories that are mainly associated with stable bindings rather than transient interactions. The inference power of this systematic analysis is substantially reduced when passing from E. coli to S. cerevisiae. This extensive analysis provides a way to describe the difference in complexity between the two organisms and discusses the critical limitations affecting this type of methodology.
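The reverse-engineering step described above, inferring a graph of putative gene-gene interactions from a compendium of expression profiles, can be illustrated with a minimal sketch. The abstract does not specify which inference algorithms were used, so thresholded Pearson correlation is assumed here purely as a simple stand-in. The function name `coexpression_edges`, the genes-by-experiments matrix layout, the threshold of 0.8, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def coexpression_edges(expr, gene_names, threshold=0.8):
    """Return (gene_a, gene_b, r) for gene pairs with |Pearson r| >= threshold.

    expr: 2-D array-like, rows = genes, columns = experiments
          (an assumed layout, not the paper's data format).
    """
    expr = np.asarray(expr, dtype=float)
    # np.corrcoef treats each row as one variable, giving a genes x genes matrix.
    corr = np.corrcoef(expr)
    # Visit each unordered gene pair once (upper triangle, diagonal excluded).
    i_idx, j_idx = np.triu_indices(len(gene_names), k=1)
    edges = []
    for i, j in zip(i_idx, j_idx):
        r = corr[i, j]
        if abs(r) >= threshold:
            edges.append((gene_names[i], gene_names[j], float(r)))
    return edges


# Toy expression matrix: 3 hypothetical genes measured in 5 experiments.
toy_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],  # geneA
    [2.1, 3.9, 6.2, 8.0, 9.9],  # geneB, roughly proportional to geneA
    [5.0, 1.0, 4.0, 2.0, 3.0],  # geneC, only weakly related to the others
]
toy_genes = ["geneA", "geneB", "geneC"]

toy_edges = coexpression_edges(toy_expr, toy_genes, threshold=0.8)
for gene_a, gene_b, r in toy_edges:
    print(f"{gene_a} -- {gene_b}: r = {r:.3f}")
```

On the toy data only the geneA-geneB pair passes the threshold. In a study of the kind summarized here, an inferred edge list like this would then be compared against known "physical" networks (operons, protein complexes, transcription factor-binding site interactions, metabolic pathways) to assess which of them are statistically over-represented; that comparison is not shown in this sketch.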