Simmons Mark P, Carr Timothy G, O'Neill Kevin
Department of Biology, Colorado State University, Fort Collins, CO 80523, USA.
Mol Phylogenet Evol. 2004 Sep;32(3):913-26. doi: 10.1016/j.ympev.2004.04.011.
We examined a broad selection of protein-coding loci from a diverse array of clades and genomes to quantify three factors that determine whether nucleotide or amino acid characters should be preferred for phylogenetic inference. First, we quantified the difference in observed character-state space between nucleotides and amino acids. Second, we quantified the loss of potential phylogenetic signal from silent substitutions when amino acids are used. Third, we used the disparity index to quantify the relative compositional heterogeneity of nucleotides and amino acids and then determined how commonly convergent (rather than unique) shifts in nucleotide and amino acid composition occur in a phylogenetic context. The greater potential phylogenetic signal for nucleotide characters was found to be enormous (on average 440% that of amino acids), whereas the greater observed character-state space for amino acids was less impressive (on average 150.4% that of nucleotides). While matrices of amino acid sequences had less compositional heterogeneity than their corresponding nucleotide sequences, heterogeneity in amino acid composition may be more homoplasious than heterogeneity in nucleotide composition. Given the ability of increased taxon sampling to better utilize the greater potential phylogenetic signal of nucleotide characters and decrease the potential for artifacts caused by heterogeneous nucleotide composition among taxa, we suggest that increased taxon sampling be performed whenever possible instead of restricting analyses to amino acid characters.
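The first two quantities named in the abstract, observed character-state space and signal lost to silent substitutions, can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names (`translate`, `states_per_site`, `silent_codon_columns`) and the three-sequence toy alignment are hypothetical, and the sketch assumes an in-frame, ungapped alignment read with the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(seq):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def states_per_site(alignment):
    """Number of distinct observed states in each column, ignoring '-' and '?'."""
    return [len(set(column) - {"-", "?"}) for column in zip(*alignment)]


def silent_codon_columns(nuc_alignment):
    """Count codon columns that vary as nucleotides but not as amino acids."""
    proteins = [translate(s) for s in nuc_alignment]
    count = 0
    for i, n_aa_states in enumerate(states_per_site(proteins)):
        codons = {s[3 * i:3 * i + 3] for s in nuc_alignment}
        if len(codons) > 1 and n_aa_states == 1:
            count += 1
    return count


# Hypothetical toy alignment (invented for illustration, not data from the study).
toy = ["ATGGCTAAA", "ATGGCCAAA", "ATGGCAAGA"]
toy_protein = [translate(s) for s in toy]

nuc_states = states_per_site(toy)
aa_states = states_per_site(toy_protein)
n_silent = silent_codon_columns(toy)
```

In this toy case the second codon (GCT/GCC/GCA, all alanine) varies only silently, so its variation is invisible once the data are coded as amino acids, whereas the third codon (AAA/AGA) varies at both levels.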