Akashi Hiroshi, Goel Piyush, John Anoop
Institute of Molecular Evolutionary Genetics, Department of Biology, Pennsylvania State University, State College, Pennsylvania, United States of America.
PLoS One. 2007 Oct 24;2(10):e1065. doi: 10.1371/journal.pone.0001065.
Reliable inference of ancestral sequences can be critical to identifying both patterns and causes of molecular evolution. Robustness of ancestral inference is often assumed among closely related species, but tests of this assumption have been limited. Here, we examine the performance of inference methods for data simulated under scenarios of codon bias evolution within the Drosophila melanogaster subgroup. Genome sequence data for multiple, closely related species within this subgroup make it an important system for studying molecular evolutionary genetics. The effects of asymmetric and lineage-specific substitution rates (i.e., varying levels of codon usage bias and departures from equilibrium) on the reliability of inferred ancestral codon usage were investigated. Maximum parsimony inference, which has been widely employed in analyses of Drosophila codon bias evolution, was compared to an approach that attempts to account for uncertainty in ancestral inference by weighting ancestral reconstructions by their posterior probabilities. The latter approach employs maximum likelihood estimation of rate and base composition parameters. For equilibrium and most non-equilibrium scenarios that were investigated, the probabilistic method appears to generate reliable ancestral codon bias inferences for molecular evolutionary studies within the D. melanogaster subgroup. These reconstructions are more reliable than parsimony inference, especially when codon usage is strongly skewed. However, inference biases are considerable for both methods under particular departures from stationarity (i.e., when adaptive evolution is prevalent). Reliability of inference can be sensitive to branch lengths, asymmetry in substitution rates, and the locations and nature of lineage-specific processes within a gene tree. Inference reliability, even among closely related species, can be strongly affected by (potentially unknown) patterns of molecular evolution in lineages ancestral to those of interest.
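The idea of weighting ancestral reconstructions by their posterior probabilities can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-state simplification (preferred "P" versus unpreferred "U" codons), a single ancestor-descendant branch, and made-up posterior values. The function names and the `sites` data are hypothetical. The first function counts changes using only the single most probable ancestral state (a stand-in for committing to one reconstruction, not a full parsimony algorithm); the second spreads each site's contribution across both possible ancestral states.

```python
# Illustrative sketch only: two codon classes, one branch, invented posteriors.
# Each site is (posterior probability that the ancestor was preferred "P",
#               observed descendant state, "P" or "U").

def count_changes_best(sites):
    """Count changes after committing to the most probable ancestral state."""
    counts = {"PU": 0.0, "UP": 0.0}  # P->U and U->P changes
    for p_anc_pref, desc in sites:
        anc = "P" if p_anc_pref >= 0.5 else "U"
        if anc != desc:
            counts[anc + desc] += 1.0
    return counts


def count_changes_weighted(sites):
    """Count changes, weighting each ancestral state by its posterior."""
    counts = {"PU": 0.0, "UP": 0.0}
    for p_anc_pref, desc in sites:
        if desc == "U":
            # A change occurred only if the ancestor was "P".
            counts["PU"] += p_anc_pref
        else:
            # A change occurred only if the ancestor was "U".
            counts["UP"] += 1.0 - p_anc_pref
    return counts


# Hypothetical posteriors, chosen only to show how the two counts diverge.
sites = [(0.9, "U"), (0.6, "U"), (0.6, "P"), (0.2, "P")]

best = count_changes_best(sites)
weighted = count_changes_weighted(sites)
print("best-state counts:", best)
print("posterior-weighted counts:", weighted)
```

In this toy input the two uncertain sites (posterior 0.6) are treated as certain by the best-state count, which reports more P-to-U than U-to-P changes by a wider margin than the weighted count does. This is the kind of discrepancy that matters when codon usage is strongly skewed.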