Comparison of the accuracies of several phylogenetic methods using protein and DNA sequences.

Author information

Hall Barry G

Affiliation

Biology Department, University of Rochester, USA.

Publication information

Mol Biol Evol. 2005 Mar;22(3):792-802. doi: 10.1093/molbev/msi066. Epub 2004 Dec 8.

Abstract

A biologically realistic method was used to simulate evolutionary trees. The method uses a real DNA coding sequence as the starting point, simulates mutation according to the mutational spectrum of Escherichia coli (including base substitutions, insertions, and deletions), and separates the processes of mutation and selection. Trees of 8, 16, 32, and 64 taxa were simulated with average branch lengths of 50, 100, 150, 200, and 250 changes per branch. The resulting sequences were aligned with ClustalX, and trees were estimated by Neighbor Joining, Parsimony, Maximum Likelihood, and Bayesian methods from both DNA sequences and the corresponding protein sequences. The estimated trees were compared with the true trees, and both topological and branch length accuracies were scored. Over the variety of conditions tested, Bayesian trees estimated from DNA sequences that had been aligned according to the alignment of the corresponding protein sequences were the most accurate, followed by Maximum Likelihood trees estimated from DNA sequences and Parsimony trees estimated from protein sequences.
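The abstract does not say which metric was used to score topological accuracy. As an illustration only, the sketch below compares an estimated tree with a true tree using the Robinson-Foulds (symmetric difference) distance, a common choice for this kind of comparison. The nested-tuple tree representation and the function names `leaf_set`, `splits`, and `rf_distance` are assumptions made for this example, not part of the paper.

```python
# Minimal sketch: Robinson-Foulds distance between two unrooted topologies.
# Trees are nested tuples of taxon names (a hypothetical representation
# chosen for this example), e.g. (("A", "B"), ("C", "D")).

def leaf_set(tree):
    """Return the set of taxon names found in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions (splits) induced by internal edges.

    Each split is stored as the side that does NOT contain a fixed reference
    taxon, so the result does not depend on where the tree is rooted.
    """
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if reference in below else below
        # Skip trivial splits (a single taxon against all the others).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Count the splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))


true_tree = (("A", "B"), ("C", "D"))
estimated_tree = (("A", "C"), ("B", "D"))
print(rf_distance(true_tree, estimated_tree))
```

A distance of 0 means the two topologies are identical; larger values mean more internal branches differ. This metric ignores branch lengths, which the study scored separately.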
