Kunin Victor, Ouzounis Christos A
Computational Genomics Group, The European Bioinformatics Institute, EMBL Cambridge Outstation, Cambridge CB10 1SD, UK.
Bioinformatics. 2003 Jul 22;19(11):1412-6. doi: 10.1093/bioinformatics/btg174.
While current computational methods allow the reconstruction of individual ancestral protein sequences, reconstruction of the complete gene content of ancestral species is not yet an established task. In this paper, we describe GENETRACE, an efficient linear-time algorithm that reconstructs the evolutionary history of individual protein families as well as the complete gene content of ancestral species. The performance of the method was validated with a simulated evolution program called SimulEv. Our results indicate that, given a set of correct phylogenetic profiles and a correct species tree, ancestral gene content can be reconstructed with sensitivity and selectivity of more than 90%. SimulEv simulations were also used to evaluate the performance of gene content-based phylogenetic tree reconstruction, suggesting that these trees may be accurate at the terminal branches but suffer from long branch attraction near the root of the tree.
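The abstract does not spell out GENETRACE's reconstruction rules or define its two accuracy measures, so the following is only a minimal sketch of the general idea, not the published algorithm. It assumes a Dollo-style parsimony rule (each family arises once, at the last common ancestor of the species that carry it, and may be lost afterwards), and it assumes sensitivity = TP/(TP+FN) and selectivity = TP/(TP+FP) counted over gene families at one ancestral node. The names `reconstruct`, `sensitivity_selectivity`, the toy tree, and the family labels are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def reconstruct(children: Dict[str, List[str]],
                profiles: Dict[str, Set[str]],
                root: str) -> Dict[str, Set[str]]:
    """Assign gene families to every node of a species tree.

    Assumed rule (Dollo-style, NOT necessarily GENETRACE's): a family is
    present at an internal node if it occurs somewhere below that node and
    either the parent already has it or at least two child subtrees contain it.
    """
    below: Dict[str, Set[str]] = {}

    def collect(node: str) -> Set[str]:
        # Post-order pass: families found anywhere in this node's subtree.
        kids = children.get(node, [])
        if not kids:
            below[node] = set(profiles[node])
        else:
            below[node] = set()
            for kid in kids:
                below[node] |= collect(kid)
        return below[node]

    collect(root)
    content: Dict[str, Set[str]] = {}

    def assign(node: str, inherited: Set[str]) -> None:
        # Pre-order pass: decide presence using the parent's content.
        kids = children.get(node, [])
        if not kids:
            content[node] = set(below[node])  # leaves keep their observed profile
        else:
            content[node] = {
                fam for fam in below[node]
                if fam in inherited or sum(fam in below[k] for k in kids) >= 2
            }
        for kid in kids:
            assign(kid, content[node])

    assign(root, set())
    return content


def sensitivity_selectivity(predicted: Set[str],
                            truth: Set[str]) -> Tuple[float, float]:
    """Return (TP / (TP + FN), TP / (TP + FP)) under the assumed definitions."""
    tp = len(predicted & truth)
    sensitivity = tp / len(truth) if truth else 1.0
    selectivity = tp / len(predicted) if predicted else 1.0
    return sensitivity, selectivity


# Toy species tree ((A,B)X,(C,D)Y)R with hypothetical phylogenetic profiles.
children = {"R": ["X", "Y"], "X": ["A", "B"], "Y": ["C", "D"]}
profiles = {"A": {"f1", "f2"}, "B": {"f1"}, "C": {"f1", "f3"}, "D": {"f3"}}
ancestral = reconstruct(children, profiles, "R")
```

In this toy case the family `f1`, found on both sides of the root, is placed at `R`, while `f3` is placed only at `Y` and `f2` stays confined to leaf `A`. Each pass visits every node once, so the work grows linearly with the number of nodes per family. In a simulation setting such as the one described, the true ancestral content is known, so a call like `sensitivity_selectivity(ancestral["Y"], true_content_of_Y)` would score the reconstruction at that node.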