Green M L, Karp P D
Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA.
Nucleic Acids Res. 2006 Aug 7;34(13):3687-97. doi: 10.1093/nar/gkl438. Print 2006.
Different pathway databases use different biological notions of a pathway. These pathway ontologies significantly affect pathway computations: computational users of pathway databases will obtain different results depending on the pathway ontology used by the databases they employ, and different pathway ontologies are preferable for different end uses. We explore differences in pathway ontologies by comparing the BioCyc and KEGG ontologies. The BioCyc ontology defines a pathway as a conserved, atomic module of the metabolic network of a single organism, i.e. a module that is often regulated as a unit and whose boundaries are defined at high-connectivity stable metabolites. KEGG pathways are on average 4.2 times larger than BioCyc pathways, and combine multiple biological processes from different organisms to produce a substrate-centered reaction mosaic. We compared KEGG and BioCyc pathways using genome context methods, which determine the functional relatedness of pairs of genes. For each method we employed, a pair of genes randomly selected from a BioCyc pathway is more likely to be related by that method than is a pair of genes randomly selected from a KEGG pathway, supporting the conclusion that the BioCyc pathway conceptualization is closer to a single conserved biological process than is that of KEGG.
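The pair-based comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `related_pair_fraction`, the toy gene identifiers, and the set of "related" pairs are all hypothetical. It also assumes that gene pairs are pooled across all pathways of a database before the fraction is computed, which the abstract does not specify.

```python
from itertools import combinations


def related_pair_fraction(pathways, related_pairs):
    """Fraction of within-pathway gene pairs that one method calls related.

    pathways: dict mapping a pathway name to a set of gene identifiers.
    related_pairs: set of frozensets, each holding two genes that a single
        genome context method reports as functionally related.
    Assumption: pairs are pooled over all pathways, so larger pathways
    contribute more pairs to the total.
    """
    total = 0
    related = 0
    for genes in pathways.values():
        # Enumerate every unordered pair of genes within this pathway.
        for gene_a, gene_b in combinations(sorted(genes), 2):
            total += 1
            if frozenset((gene_a, gene_b)) in related_pairs:
                related += 1
    return related / total if total else 0.0


# Toy data (hypothetical): the same five genes grouped two different ways.
small_pathways = {"P1": {"g1", "g2", "g3"}, "P2": {"g4", "g5"}}
large_pathways = {"Q1": {"g1", "g2", "g3", "g4", "g5"}}
related = {
    frozenset(("g1", "g2")),
    frozenset(("g2", "g3")),
    frozenset(("g4", "g5")),
}

small_fraction = related_pair_fraction(small_pathways, related)
large_fraction = related_pair_fraction(large_pathways, related)
print(small_fraction, large_fraction)
```

In this toy setting the finer-grained grouping yields a higher fraction of related pairs than the single merged pathway. That is the kind of difference the abstract reports between BioCyc and KEGG, although the real analysis rests on genome context data rather than a hand-written set of pairs.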