Mazurie Aurélien, Bonchev Danail, Schwikowski Benno, Buck Gregory A
Systems Biology Group, Institut Pasteur, 25 rue du Docteur Roux, 75015 Paris, France.
Bioinformatics. 2008 Nov 15;24(22):2579-85. doi: 10.1093/bioinformatics/btn503. Epub 2008 Sep 26.
Although metabolic reactions are unquestionably shaped by evolutionary processes, the degree to which the overall structure and complexity of their interconnections are linked to the phylogeny of species has not been evaluated in depth. Here, we apply an original metabolome representation, termed Network of Interacting Pathways or NIP, with a combination of graph theoretical and machine learning strategies, to address this question. NIPs compress the information of the metabolic network exhibited by a species into much smaller networks of overlapping metabolic pathways, where nodes are pathways and links are the metabolites they exchange.
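The pathway-level compression described above can be illustrated with a minimal sketch. It assumes each pathway is given as a set of metabolite names and treats any metabolite shared by two pathways as a link between them, which is only a proxy for the exchange relation the abstract mentions. The function name `build_nip`, the descriptor names, and the toy pathway contents are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def build_nip(pathways):
    """Return the edges of a pathway-level graph.

    Nodes are pathway names. Two pathways are linked when they share at
    least one metabolite (an assumed stand-in for 'metabolites exchanged').
    The result maps each linked pair to the set of shared metabolites.
    """
    edges = {}
    for a, b in combinations(sorted(pathways), 2):
        shared = pathways[a] & pathways[b]
        if shared:
            edges[(a, b)] = shared
    return edges


def simple_descriptors(pathways, edges):
    """Compute a few basic structural descriptors of the graph (illustrative only)."""
    n_nodes = len(pathways)
    n_edges = len(edges)
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "average_degree": 2 * n_edges / n_nodes if n_nodes else 0.0,
    }


# Toy, made-up input: three pathways with heavily simplified metabolite sets.
toy_pathways = {
    "glycolysis": {"glucose-6-phosphate", "pyruvate"},
    "pentose_phosphate": {"glucose-6-phosphate", "ribose-5-phosphate"},
    "tca_cycle": {"pyruvate", "citrate"},
}

nip = build_nip(toy_pathways)
descriptors = simple_descriptors(toy_pathways, nip)
```

Here the pathway graph has far fewer nodes than the underlying reaction network would, which is the sense in which a NIP compresses the metabolic network.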
Our analysis shows that a small set of descriptors of the structure and complexity of the NIPs, combined into regression models, reproduces reference phylogenetic distances derived from 16S rRNA sequences very accurately (10-fold cross-validation correlation coefficient higher than 0.9). Our method also scored better than previous work on metabolism-based phylogenetic reconstructions, as assessed by branch distances score, topological similarity and second cousins score. Thus, our metabolome representation as a network of overlapping metabolic pathways captures sufficient information about the underlying evolutionary events leading to the formation of metabolic networks and species phylogeny. It is important to note that precise knowledge of all of the reactions in these pathways is not required for these reconstructions. These observations underscore the potential of abstract, modular representations of metabolic reactions as tools for studying the evolution of species.
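How a cross-validated correlation coefficient of this kind is obtained can be sketched as follows. This is a simplified illustration, not the authors' procedure: it fits an ordinary least-squares line on a single descriptor rather than a multi-descriptor regression model, and the data are synthetic values generated purely to exercise the code. The function names and the choice of a fixed shuffle seed are assumptions.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def cross_validated_r(xs, ys, k=10, seed=0):
    """Correlate held-out predictions with the reference values.

    Each sample is predicted by a model fitted on the other k-1 folds,
    so the returned correlation reflects out-of-sample performance.
    """
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    predictions = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            predictions[i] = slope * xs[i] + intercept
    return pearson(predictions, ys)


# Synthetic, noise-free toy data: a descriptor difference per species pair
# and a "reference distance" that is an exact linear function of it.
toy_descriptor = [float(i) for i in range(30)]
toy_distance = [0.5 * x + 1.0 for x in toy_descriptor]

r = cross_validated_r(toy_descriptor, toy_distance, k=10)
```

Because every prediction comes from a model that never saw that sample, a high value of `r` indicates that the descriptor generalizes across species pairs rather than merely fitting the training data.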
Supplementary data are available at Bioinformatics online.