Department of Biology, Dalhousie University, Halifax, NS, B3H 4J1, Canada.
J Mol Evol. 2013 Apr;76(4):205-15. doi: 10.1007/s00239-013-9549-0. Epub 2013 Feb 27.
Functional shifts during protein evolution are expected to yield shifts in substitution rate, and statistical methods can test for this at both codon and amino acid levels. Although methods based on models of sequence evolution serve as powerful tools for studying evolutionary processes, violating underlying assumptions can lead to false biological conclusions. It is not unusual for functional shifts to be accompanied by changes in other aspects of the evolutionary process, such as codon or amino acid frequencies. However, models used to test for functional divergence assume these frequencies remain constant over time. We employed simulation to investigate the impact of non-stationary evolution on functional divergence inference. We investigated three likelihood ratio tests based on codon models and found varying degrees of sensitivity. Joint effects of shifts in frequencies and selection pressures can be large, leading to false signals for positive selection. Amino acid-based tests (FunDi and Bivar) were also compromised when several aspects of the substitution process were not adequately modeled. We applied the same tests to a core genome "scan" for functional divergence between light-adapted ecotypes of the cyanobacterium Prochlorococcus, and carried out gene-specific simulations for ten genes. Results of those simulations illustrated how the inference of functional divergence at the genomic level can be seriously impacted by model misspecification. Although computationally costly, simulations motivated by data in hand are warranted when several aspects of the substitution process are either misspecified or not included in the models upon which the statistical tests were built.
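The calibration idea behind simulation-based checks of a likelihood ratio test can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, log-likelihood values, and null statistics below are hypothetical placeholders. In practice the log-likelihoods would come from codon-model software fitted to the real alignment and to alignments simulated under a null scenario (for example, shifting codon frequencies without functional divergence).

```python
from typing import Sequence


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def empirical_p_value(observed: float, simulated_null: Sequence[float]) -> float:
    """Share of null-simulated statistics at least as large as the observed one.

    Uses the add-one correction so the estimate is never exactly zero.
    """
    exceed = sum(1 for s in simulated_null if s >= observed)
    return (exceed + 1) / (len(simulated_null) + 1)


# Hypothetical log-likelihoods for one gene (assumed values, not from the paper).
observed = lrt_statistic(lnl_null=-2310.4, lnl_alt=-2304.9)

# Hypothetical LRT statistics from alignments simulated under the null scenario.
null_stats = [0.3, 1.2, 4.8, 9.7, 12.5, 2.1, 15.0, 7.4, 0.0]

p_value = empirical_p_value(observed, null_stats)
print(f"LRT statistic = {observed:.2f}, simulation-based p = {p_value:.2f}")
```

Comparing the observed statistic with a simulated null distribution, rather than with the nominal chi-square distribution, is one way to account for aspects of the substitution process that the test's model leaves out.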