Simmons Mark P, Reeves Aaron, Davis Jerrold I
Department of Biology, Colorado State University, Fort Collins, CO 80523, USA.
L.H. Bailey Hortorium, Department of Plant Biology, Cornell University, Ithaca, NY 14853, USA.
Cladistics. 2004 Apr;20(2):191-204. doi: 10.1111/j.1096-0031.2004.00014.x.
With only four alternative character states, parallelisms and reversals are expected to occur frequently when using nucleotide characters for phylogenetic inference. Greater available character-state space has been described as one of the advantages of third codon positions relative to first and second codon positions, as well as amino acids relative to nucleotides. We used simulations to quantify how character-state space and rate of evolution relate to one another, and how this relationship is affected by differences in: tree topology, branch lengths, rate heterogeneity among sites, probability of change among states, and frequency of character states. Specifically, we examined how inferred tree lengths, consistency and retention indices, and accuracy of phylogenetic inference are affected. Our results indicate that the relatively small increases in the character-state space evident in empirical data matrices can provide enormous benefits for the accuracy of phylogenetic inference. This advantage may become more pronounced with unequal probabilities of change among states. Although increased character-state space greatly improved the accuracy of topology inference, improvements in the estimation of the correct tree length were less apparent. Accuracy and inferred tree length improved most when character-state space increased initially; further increases provided more modest improvements.
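The link between character-state space and homoplasy can be illustrated with a minimal sketch. It is not the authors' simulation design. It assumes a fully symmetric k-state model (equal state frequencies and equal probabilities of change among states, i.e. the Jukes-Cantor model generalized to k states) and compares just two sequences. The function names and the example values of k and d are hypothetical, chosen only for illustration. Under these assumptions, the expected proportion of sites that differ after d changes per site is (k-1)/k * (1 - exp(-k*d/(k-1))), so the share of changes hidden by reversals and parallelisms shrinks as k grows.

```python
import math


def expected_observed_difference(num_states: int, changes_per_site: float) -> float:
    """Expected proportion of differing sites between two sequences.

    Assumes a symmetric k-state model (all changes equally likely),
    with `changes_per_site` the true expected number of changes per site.
    """
    if num_states < 2:
        raise ValueError("need at least two character states")
    if changes_per_site < 0:
        raise ValueError("changes_per_site must be non-negative")
    k = num_states
    return (k - 1) / k * (1.0 - math.exp(-k * changes_per_site / (k - 1)))


def hidden_fraction(num_states: int, changes_per_site: float) -> float:
    """Fraction of true changes not visible as an observed difference."""
    if changes_per_site == 0:
        return 0.0
    observed = expected_observed_difference(num_states, changes_per_site)
    return 1.0 - observed / changes_per_site


if __name__ == "__main__":
    # Illustrative values only: 4 states (nucleotides) vs. 20 (amino acids).
    for k in (2, 4, 20, 61):
        for d in (0.1, 0.5, 1.0):
            print(f"k={k:2d} d={d:.1f} hidden={hidden_fraction(k, d):.3f}")
```

In this simplified setting the hidden fraction falls fastest when k increases from small values and levels off for large k, which is qualitatively in line with the diminishing returns described in the abstract. The sketch says nothing about tree topology, rate heterogeneity among sites, or unequal state frequencies, all of which the study varied.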