Yanai Itai, DeLisi Charles
Bioinformatics Graduate Program and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA.
Genome Biol. 2002 Oct 25;3(11):research0064. doi: 10.1186/gb-2002-3-11-research0064.
Comparative genomics provides at least three methods beyond traditional sequence similarity for identifying functional links between genes: the examination of common phylogenetic distributions, the analysis of conserved proximity along the chromosomes of multiple genomes, and observations of fusions of genes into a multidomain gene in another organism. We have previously generated the links according to each of these methods individually for 43 known microbial genomes. Here we combine these results to construct networks of functional associations.
We show that the functional networks obtained by applying these methods have different topologies and that the information they provide is largely additive. In particular, the combined networks of functional links contain an average of 57% of an organism's complete genetic complement, uncover substantial portions of known pathways, and suggest the function of previously unannotated genes. In addition, the combined networks are qualitatively different from the networks obtained using individual methods. They have a dominant cluster that contains approximately 80%-90% of the linked genes, independent of genome size, and the dominant clusters show the small-world behavior expected of a biological system, with global connectivity that is nearly random and local properties that are highly ordered.
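The combination step and the network statistics mentioned above can be illustrated with a minimal sketch. It assumes that each method yields a list of undirected gene pairs and that "combining" means taking their union. The gene names, the toy links, and the genome size below are hypothetical and are not data from the study. The clustering coefficient used here is the standard average local clustering coefficient, which may differ in detail from the measure the authors applied.

```python
from collections import defaultdict, deque


def combine_links(*link_sets):
    """Union of undirected links from several methods (self-links dropped)."""
    combined = set()
    for links in link_sets:
        for a, b in links:
            if a != b:
                combined.add(frozenset((a, b)))
    return combined


def adjacency(links):
    """Build a gene -> set-of-neighbours map from a set of undirected links."""
    adj = defaultdict(set)
    for link in links:
        a, b = tuple(link)
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def largest_component(adj):
    """Return the largest connected cluster of genes (breadth-first search)."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in comp:
                    comp.add(nb)
                    queue.append(nb)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def average_clustering(adj):
    """Mean local clustering coefficient; genes with < 2 neighbours count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbs in adj.values():
        k = len(nbs)
        if k < 2:
            continue
        nb_list = list(nbs)
        linked = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nb_list[j] in adj[nb_list[i]]
        )
        total += 2.0 * linked / (k * (k - 1))
    return total / len(adj)


# Hypothetical toy links, one list per method (not data from the paper).
phylogenetic = [("g1", "g2"), ("g2", "g3")]
proximity = [("g2", "g3"), ("g3", "g1"), ("g6", "g7")]
fusion = [("g3", "g4")]
genome_size = 10  # hypothetical number of genes in the organism

combined = combine_links(phylogenetic, proximity, fusion)
adj = adjacency(combined)
coverage = len(adj) / genome_size            # fraction of genes with any link
dominant = largest_component(adj)
dominant_fraction = len(dominant) / len(adj)  # share of linked genes in the cluster
clustering = average_clustering(adj)
```

In this toy case the union holds more linked genes than any single method does, which is the sense in which the information is additive. A small-world check would additionally compare the clustering coefficient and the mean shortest path length against those of a random network with the same number of genes and links.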
When the information on functional linkage provided by three emerging computational methods is combined, the integrated network uncovers large numbers of conserved pathways and identifies clusters of functionally related genes. It therefore shows considerable utility and promise as a tool for understanding genomic structure and for guiding high-throughput experimental investigations.