Centro de Biologia Molecular "Severo Ochoa" CSIC-UAM Cantoblanco, 28049 Madrid, Spain.
Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, UK.
Syst Biol. 2019 Nov 1;68(6):987-1002. doi: 10.1093/sysbio/syz022.
The molecular clock hypothesis, which states that substitutions accumulate in protein sequences at a constant rate, plays a fundamental role in molecular evolution, but it is violated when selective or mutational processes vary with time. Such violations of the molecular clock have been widely investigated for protein sequences, but not yet for protein structures. Here, we introduce a novel statistical test (Significant Clock Violations) and perform a large-scale assessment of the molecular clock in the evolution of both protein sequences and structures in three large superfamilies. After validating our method with computer simulations, we find that clock violations are generally consistent between sequence and structure evolution, but that they tend to be larger and more significant in structure evolution. Moreover, changes of function, assessed through Gene Ontology and InterPro terms, are associated with large and significant clock violations in structure evolution. We find that almost one third of significant clock violations are significant in structure evolution but not in sequence evolution, highlighting the advantage of using structure information for assessing accelerated evolution and gathering hints of positive selection. Clock violations between closely related pairs are frequently significant in sequence evolution, consistent with the observed time dependence of the substitution rate, which is attributed to the segregation of neutral and slightly deleterious polymorphisms. They are not frequently significant in structure evolution, suggesting that these substitutions do not affect protein structure, although they may affect stability. These results are consistent with the view that natural selection, both negative and positive, constrains protein structures more strongly than protein sequences. Our code for computing clock violations is freely available at https://github.com/ugobas/Molecular_clock.