Computational Biology and Bioinformatics, Université Libre de Bruxelles, CP 165/61, Roosevelt Ave. 50, Brussels, 1050, Belgium.
Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, Brussels, 1050, Belgium.
BMC Biol. 2020 Oct 20;18(1):146. doi: 10.1186/s12915-020-00870-9.
How, and to what extent, evolution acts on DNA and protein sequences to ensure mutational robustness and evolvability is a long-standing open question in the field of molecular evolution. We addressed this issue through the first structurome-scale computational investigation, in which we estimated the change in folding free energy upon all possible single-site mutations introduced in more than 20,000 protein structures, as well as through available experimental stability and fitness data.
At the amino acid level, we found the protein surface to be more robust against random mutations than the core, with the difference being stronger for small proteins. Destabilizing mutations are more numerous in the core and neutral mutations on the surface, whereas stabilizing mutations account for about 4% in both regions. At the genetic code level, we observed the smallest destabilization for mutations that are due to substitutions of base III in the codon, followed by base I, bases I+III, base II, and other multiple base substitutions. This ranking is strongly anticorrelated with the codon-anticodon mispairing frequency in the translation process. This suggests that the standard genetic code is optimized to limit the impact of random mutations, but even more so to limit translation errors. At the codon level, both the codon usage and the usage bias appear to optimize mutational robustness and translation accuracy, especially for surface residues.
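The grouping of mutations by substituted codon base (III, I, I+III, II, other multiple substitutions) can be illustrated with a minimal Python sketch. It only classifies codon pairs of the standard genetic code by the positions at which they differ; it does not estimate any folding free energy change, and it is not the authors' pipeline. The names `CODON_TABLE`, `substitution_class`, and `count_classes` are hypothetical and introduced here for illustration only.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# DNA alphabet; '*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Labels follow the grouping used in the abstract; every other combination
# of changed positions is pooled as "other multiple".
LABELS = {(3,): "III", (1,): "I", (1, 3): "I+III", (2,): "II"}


def substitution_class(codon_a, codon_b):
    """Return the class of base substitution that turns codon_a into codon_b."""
    changed = tuple(
        i + 1 for i, (a, b) in enumerate(zip(codon_a, codon_b)) if a != b
    )
    if not changed:
        return "identical"
    return LABELS.get(changed, "other multiple")


def count_classes(missense_only=True):
    """Count ordered codon pairs per substitution class.

    With missense_only=True, only pairs that change one amino acid into a
    different amino acid are counted (synonymous and stop-related pairs
    are skipped).
    """
    counts = Counter()
    for a, b in product(CODON_TABLE, repeat=2):
        if a == b:
            continue
        aa_a, aa_b = CODON_TABLE[a], CODON_TABLE[b]
        if missense_only and (aa_a == aa_b or "*" in (aa_a, aa_b)):
            continue
        counts[substitution_class(a, b)] += 1
    return counts


if __name__ == "__main__":
    for label, n in count_classes().most_common():
        print(label, n)
```

In an analysis of the kind described above, each amino acid substitution would additionally carry an estimated stability change, and those values would be averaged per class; that step is omitted here because it depends on a stability predictor that this sketch does not assume.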
Our results highlight the non-universality of mutational robustness and its multiscale dependence on protein features, the structure of the genetic code, and the codon usage. Our analyses and approach are strongly supported by available experimental mutagenesis data.