Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL, USA.
Department of Biochemistry and Molecular Genetics, Computational Bioscience Program, University of Colorado Denver, Aurora, CO, USA.
J Mol Evol. 2020 Dec;88(10):720-730. doi: 10.1007/s00239-020-09969-7. Epub 2020 Oct 29.
Heterotachy, the change in sequence evolutionary rate over time, is a common feature of protein molecular evolution. Decades of studies have shed light on the conditions under which heterotachy occurs, and there is evidence that site-specific evolutionary rate shifts are correlated with changes in protein function. Here, we present a large-scale, computational analysis using thousands of protein sequence alignments from animal and plant proteomes, representing genes related either by orthology (speciation events) or paralogy (gene duplication), to compare sequence divergence patterns in orthologous vs. paralogous sequence alignments. We use sequence-based phylogenetic analyses to infer overall sequence divergence (tree length/number of sequences) and to fit site-specific rates to a discrete gamma distribution with a shape parameter α. This inference method is applied to real protein sequence alignments, as well as alignments simulated under various models of protein sequence evolution. Our simulations indicate that sequence divergence and the α parameter are positively correlated when sequences evolve with heterotachy, meaning that inferred site rate distributions appear more uniform as sequences diverge. Divergence and α are also positively correlated in both orthologous and paralogous genes, but the average increase in α (as a function of divergence) is significantly higher in paralogous protein alignments than in orthologous alignments. This result is consistent with the widely held view that recently duplicated proteins initially evolve under relaxed selective pressure, promoting functional divergence by accumulation of amino acid replacements, and hence experience more evolutionary rate fluctuations than orthologous proteins. We discuss these findings in the context of the ortholog conjecture, a long-standing assumption in molecular evolution, which posits that protein sequences related by orthology tend to be more functionally conserved than paralogous proteins.
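The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the regex-based Newick parsing, and the example tree are assumptions made for illustration, and the gamma distribution is treated as continuous with mean 1 rather than discretized into rate categories. It shows the divergence measure (tree length divided by the number of sequences) and why a larger shape parameter α corresponds to more uniform site rates: a mean-1 gamma distribution has variance 1/α.

```python
import math
import random
import re


def tree_length(newick: str) -> float:
    """Sum all branch lengths, i.e. the numbers that follow ':' in a Newick string."""
    return sum(float(x) for x in re.findall(r":([0-9.eE+-]+)", newick))


def count_leaves(newick: str) -> int:
    """Count leaf labels: names that directly follow '(' or ','."""
    return len(re.findall(r"[(,]\s*([^(),:;\s]+)", newick))


def divergence(newick: str) -> float:
    """Overall divergence as described in the abstract: tree length / number of sequences."""
    return tree_length(newick) / count_leaves(newick)


def gamma_rate_cv(alpha: float) -> float:
    """Coefficient of variation of a mean-1 gamma distribution: 1 / sqrt(alpha)."""
    return 1.0 / math.sqrt(alpha)


def sample_site_rates(alpha: float, n_sites: int, seed: int = 0) -> list:
    """Draw site rates from a gamma with shape alpha and scale 1/alpha (mean 1)."""
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


# Hypothetical three-sequence tree, used only to exercise the functions.
example_tree = "((A:0.1,B:0.2):0.05,C:0.3);"
print("divergence:", divergence(example_tree))
for a in (0.5, 2.0, 8.0):
    print("alpha =", a, "rate CV =", gamma_rate_cv(a))
```

In this sketch a small α gives a strongly skewed rate distribution (a few fast sites, many slow ones), whereas a large α pulls all site rates toward 1, which is the sense in which inferred rate distributions "appear more uniform" as α increases.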