Bergmann Sven, Ihmels Jan, Barkai Naama
Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.
PLoS Biol. 2004 Jan;2(1):E9. doi: 10.1371/journal.pbio.0020009. Epub 2003 Dec 15.
Comparing genomic properties of different organisms is of fundamental importance in the study of biological and evolutionary principles. Although differences among organisms are often attributed to differential gene expression, genome-wide comparative analysis thus far has been based primarily on genomic sequence information. We present a comparative study of large datasets of expression profiles from six evolutionarily distant organisms: S. cerevisiae, C. elegans, E. coli, A. thaliana, D. melanogaster, and H. sapiens. We use genomic sequence information to connect these data and compare global and modular properties of the transcription programs. Linking genes whose expression profiles are similar, we find that for all organisms the connectivity distribution follows a power law, highly connected genes tend to be essential and conserved, and the expression program is highly modular. We reveal the modular structure by decomposing each set of expression data into coexpressed modules. Functionally related sets of genes are frequently coexpressed in multiple organisms. Yet their relative importance to the transcription program and their regulatory relationships vary among organisms. Our results demonstrate the potential of combining sequence and expression data for improving functional gene annotation and expanding our understanding of how gene expression and diversity evolved.
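The idea of linking genes with similar expression profiles and then examining the connectivity distribution can be illustrated with a minimal sketch. The abstract does not state the similarity measure or cutoff, so the use of Pearson correlation, the threshold of 0.8, the function names, and the toy profiles below are all assumptions made for illustration, not the authors' procedure.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def connectivity(profiles, threshold=0.8):
    """Count, for each gene, how many other genes it is linked to.

    Two genes are linked when their profiles correlate at or above
    `threshold` (an assumed cutoff, not taken from the paper).
    """
    genes = list(profiles)
    degree = {gene: 0 for gene in genes}
    for i, gene in enumerate(genes):
        for other in genes[i + 1:]:
            if pearson(profiles[gene], profiles[other]) >= threshold:
                degree[gene] += 1
                degree[other] += 1
    return degree


def connectivity_distribution(degree):
    """Map each connectivity k to the number of genes with that k."""
    return dict(sorted(Counter(degree.values()).items()))


# Hypothetical toy data: four genes measured under four conditions.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.1, 2.0, 2.9, 4.2],
    "geneD": [4.0, 1.0, 3.0, 2.0],
}
toy_degree = connectivity(toy_profiles)
toy_distribution = connectivity_distribution(toy_degree)
```

On a genome-scale dataset, plotting the resulting distribution on log-log axes would be one way to check whether it is roughly linear, as expected for a power law; the toy data here are far too small to show that behavior.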