Clemente José C, Satou Kenji, Valiente Gabriel
School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi Ishikawa 923-1292, Japan.
Bioinformatics. 2007 Jan 15;23(2):e110-5. doi: 10.1093/bioinformatics/btl307.
Recent results on horizontal gene transfer suggest that phylogeny cannot be reconstructed conclusively from sequence data alone, prompting a shift from approaches based on polymorphism information in DNA or protein sequences to studies aimed at understanding the evolution of complete biological processes. The increasing amount of available information on metabolic pathways for several species makes it increasingly relevant to understand the similarities and differences among such pathways. These similarities can then be used to infer phylogenetic trees that are not based exclusively on sequence data, thereby avoiding the problems mentioned above.
In this article, we present a method to assess the structural similarity of metabolic pathways for several organisms. Our algorithms use one of three enzyme similarity measures (hierarchical, information content, gene ontology) and one of two clustering methods (neighbor-joining, unweighted pair group method with arithmetic mean) to produce a phylogenetic tree in both Newick and graphical format. The web server implementing our algorithms is optimized to answer queries in linear time.
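To make the pipeline concrete, the following is a minimal sketch of one possible combination: a hierarchical enzyme similarity followed by UPGMA clustering, with the tree written in Newick format. It is not the authors' implementation. It rests on several assumptions: the hierarchical measure is taken to be the fraction of leading EC number levels two enzymes share, each pathway is reduced to a plain set of EC numbers (ignoring the pathway structure the method actually compares), the pathway distance is a simple best-match average, and the organism names and EC sets are made-up toy data. The function names `ec_similarity`, `pathway_distance` and `upgma` are hypothetical, and the sketch makes no attempt at the linear-time behaviour of the web server.

```python
from itertools import combinations


def ec_similarity(a, b):
    """Fraction of leading EC classification levels shared by two enzymes."""
    shared = 0
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        shared += 1
    return shared / 4


def pathway_distance(p, q):
    """Assumed stand-in: 1 minus the symmetric average best-match similarity."""
    def best(src, dst):
        return sum(max(ec_similarity(e, f) for f in dst) for e in src) / len(src)
    return 1.0 - (best(p, q) + best(q, p)) / 2


def upgma(names, dist):
    """UPGMA clustering; dist maps frozenset({a, b}) to a distance. Returns Newick."""
    # Each cluster id maps to (Newick subtree, number of leaves, node height).
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    d = {frozenset((i, j)): dist[frozenset((names[i], names[j]))]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        i, j = min(combinations(sorted(clusters), 2),
                   key=lambda pair: d[frozenset(pair)])
        (ti, ni, hi), (tj, nj, hj) = clusters.pop(i), clusters.pop(j)
        h = d[frozenset((i, j))] / 2
        tree = f"({ti}:{h - hi:.4f},{tj}:{h - hj:.4f})"
        # Distance to the merged cluster is the size-weighted mean of its parts.
        for k in clusters:
            d[frozenset((next_id, k))] = (
                ni * d[frozenset((i, k))] + nj * d[frozenset((j, k))]
            ) / (ni + nj)
        clusters[next_id] = (tree, ni + nj, h)
        next_id += 1
    return next(iter(clusters.values()))[0] + ";"


# Toy input: hypothetical organisms, each with one pathway given as EC numbers.
pathways = {
    "orgA": {"1.1.1.1", "2.7.1.1"},
    "orgB": {"1.1.1.2", "2.7.1.1"},
    "orgC": {"3.1.3.11", "4.1.2.13"},
}
names = sorted(pathways)
dist = {frozenset((a, b)): pathway_distance(pathways[a], pathways[b])
        for a, b in combinations(names, 2)}
newick = upgma(names, dist)
print(newick)
```

In this toy input, orgA and orgB share one enzyme exactly and another down to the third EC level, so they are joined first and orgC attaches at the root. Swapping in a different similarity measure or neighbor-joining would change only `ec_similarity` or `upgma`, which is why the sketch keeps the stages separate.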
The software is available for free public use on a web server, at the address http://www.jaist.ac.jp/~clemente/cgi-bin/phylo.pl. It is available on demand in source code form for research use to educational institutions, non-profit research institutes, government research laboratories and individuals, for non-exclusive use, without the right of the licensee to further redistribute the source code.