Kepp Kasper P
DTU Chemistry, Technical University of Denmark, DK 2800 Kongens Lyngby, Denmark.
J Phys Chem B. 2014 Feb 20;118(7):1799-812. doi: 10.1021/jp4119138. Epub 2014 Feb 6.
Protein stability is affected in several diseases and is of substantial interest in efforts to correlate genotypes to phenotypes. Superoxide dismutase 1 (SOD1) is a suitable test case for such correlations due to its abundance, stability, available crystal structures and thermochemical data, and physiological importance. In this work, stability changes of SOD1 mutations were computed with five methods, CUPSAT, I-Mutant2.0, I-Mutant3.0, PoPMuSiC, and SDM, with emphasis on structural sensitivity as a potential issue in structure-based protein calculation. The large correlation between experimental literature data of SOD1 dimers and monomers (r = 0.82) suggests that mutations in separate protein monomers are mostly additive. PoPMuSiC was most accurate (typical MAE ~ 1 kcal/mol, r ~ 0.5). The relative performance of the methods was not very structure-dependent, and the more accurate methods also displayed less structural sensitivity, with the standard deviation from different high-resolution structures down to ~0.2 kcal/mol. Structures of variable resolution and number of protein copies locally affected specific sites, emphasizing the use of state-relevant crystal structures when such sites are of interest, but had little impact on overall batch estimates. Protein-interaction effects (as a mimic of crystal packing) were small for the more accurate methods. Thus, batch computations, relevant to, e.g., comparisons of disease/nondisease mutant sets or different clades in phylogenetic trees, are much more significant than single mutant calculations and may be the only meaningful way to computationally bridge the genotype-phenotype gap of proteomics. Finally, mutations involving glycine were most difficult to model, of relevance to future method improvement. This could be due to structure changes (glycine has a low structural propensity) or water colocalization with glycine.
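The abstract reports three kinds of summary statistics: a mean absolute error (MAE) and a correlation coefficient r between computed and experimental stability changes, and a standard deviation of the computed values across different crystal structures. The following is a minimal sketch of how such batch statistics could be computed, not the author's actual workflow. The function names, the structure labels, and all numeric values are hypothetical placeholders chosen for illustration; none of them come from the paper. Using the sample standard deviation averaged over mutations as the measure of "structural sensitivity" is likewise an assumption.

```python
import math
import statistics


def mean_absolute_error(predicted, experimental):
    """Average absolute deviation between predicted and experimental values."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def structural_sensitivity(predictions_by_structure):
    """Mean over mutations of the sample standard deviation across structures.

    predictions_by_structure maps a structure label to a list of predicted
    values, one per mutation, in the same mutation order for every structure.
    At least two structures are required.
    """
    per_mutation = zip(*predictions_by_structure.values())
    return statistics.fmean(statistics.stdev(values) for values in per_mutation)


# Placeholder numbers only (kcal/mol); NOT data from the paper.
experimental = [-1.2, -0.4, -2.5, 0.3]
predictions_by_structure = {
    "structure_A": [-0.9, -0.6, -1.8, 0.1],  # hypothetical structure labels
    "structure_B": [-1.1, -0.5, -1.5, 0.0],
    "structure_C": [-0.8, -0.7, -2.0, 0.2],
}

for label, predicted in predictions_by_structure.items():
    mae = mean_absolute_error(predicted, experimental)
    r = pearson_r(predicted, experimental)
    print(f"{label}: MAE = {mae:.2f} kcal/mol, r = {r:.2f}")

sensitivity = structural_sensitivity(predictions_by_structure)
print(f"mean std. dev. across structures = {sensitivity:.2f} kcal/mol")
```

Computing the statistics over a whole batch of mutations, rather than inspecting single values, mirrors the abstract's point that batch estimates are more robust than single-mutant calculations.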