Dept. of Computer Science, Technion - Israel Institute of Technology, Haifa 32000, Israel.
BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S38. doi: 10.1186/1471-2105-11-S1-S38.
Pathways provide topical descriptions of cellular circuitry. Comparing analogous pathways reveals intricate insights into individual functional differences among species. While previous works in the field performed genomic comparisons and evolutionary studies based on specific genes or proteins, whole genomic sequences, or even single pathways, none of them described a genomic, system-level comparative analysis of metabolic pathways. To implement such an analysis properly, one must overcome two specific challenges: how to combine the effect of many pathways under a unified framework, and how to appropriately analyze the co-evolution of pathways. Here we present a computational approach for solving these two challenges. First, we describe a comprehensive, scalable, information-theory-based computational pipeline that calculates pathway alignment information and then compiles it in a novel manner that allows further analysis. This approach can be used for building phylogenies and for pointing out specific differences that can then be analyzed in depth. Second, we describe a new approach for comparing the evolution of metabolic pathways. This approach can be used for detecting co-evolutionary relationships between metabolic pathways.
We demonstrate the advantages of our approach by applying our pipeline to data from the MetaCyc repository (which includes a total of 205 organisms and 660 metabolic pathways). Our analysis revealed several surprising biological observations. For example, we show that the different habitats in which archaeal organisms reside are reflected by a pathway-based phylogeny. In addition, we discover two striking clusters of metabolic pathways; each cluster includes pathways whose evolution is very similar.
We demonstrate that distance measures that are based on the topology and the content of metabolic networks are useful for studying evolution and co-evolution.
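To make the idea of a content-based distance between organisms concrete, the following is a minimal sketch, not the pipeline described above. It assumes each organism is given as a mapping from pathway name to a set of reaction identifiers, and it substitutes a plain Jaccard distance for the information-theoretic pathway alignment score. The function names, organism names, and reaction identifiers are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def organism_distance(org_a, org_b):
    """Mean per-pathway Jaccard distance over pathways found in either organism.

    A pathway missing from one organism is treated as an empty reaction set,
    so it contributes the maximal distance of 1.0 (an assumption of this sketch).
    """
    pathways = set(org_a) | set(org_b)
    if not pathways:
        return 0.0
    total = sum(
        jaccard_distance(org_a.get(p, set()), org_b.get(p, set()))
        for p in pathways
    )
    return total / len(pathways)


def distance_matrix(organisms):
    """Symmetric distance table keyed by (organism, organism) pairs."""
    names = sorted(organisms)
    dist = {(n, n): 0.0 for n in names}
    for x, y in combinations(names, 2):
        d = organism_distance(organisms[x], organisms[y])
        dist[(x, y)] = dist[(y, x)] = d
    return dist


# Toy data: hypothetical organisms, pathways, and reaction identifiers.
organisms = {
    "org_A": {"glycolysis": {"R1", "R2", "R3"}, "tca": {"R4", "R5"}},
    "org_B": {"glycolysis": {"R1", "R2"}, "tca": {"R4", "R5"}},
    "org_C": {"glycolysis": {"R1", "R6"}},
}

dist = distance_matrix(organisms)
for (x, y), d in sorted(dist.items()):
    if x < y:
        print(f"{x} vs {y}: {d:.3f}")
```

A table of this kind could be passed to a standard distance-based tree-building method to obtain a pathway-based phylogeny. Restricting the same computation to a single pathway yields one distance table per pathway, and comparing those tables is one possible way to look for pathways with similar evolution.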