Parthiban Vijaya, Gromiha M Michael, Hoppe Christian, Schomburg Dietmar
Cologne University Bioinformatics Center, International Max Planck Research School, Cologne, Germany.
Proteins. 2007 Jan 1;66(1):41-52. doi: 10.1002/prot.21115.
Analyzing the factors behind protein stability is a key research topic in molecular biology and has direct implications for protein structure prediction and protein-protein interactions. We analyzed changes in protein stability upon point mutation using a distance-dependent pair potential, representing mainly through-space interactions, and a torsion angle potential, representing mainly neighboring effects, as a basic statistical mechanical setup. The synergetic effect of accessible surface area and secondary structure preferences was used as a classifier for the potentials. In addition, short-, medium-, and long-range interactions of the protein environment were analyzed. Two datasets of point mutations were used to compare theoretically predicted stabilizing energy values with experimental ΔΔG and ΔΔG(H₂O) values from thermal and chemical denaturation experiments. The datasets comprise 1538 and 1603 mutations, respectively, and cover 101 proteins that share a wide range of sequence identity. The resulting force fields were carefully evaluated with different statistical tests. For all mutations together, the results show a maximum correlation of 0.87 between predicted and measured ΔΔG values, with a standard error of 0.71 kcal/mol and a prediction accuracy of 85.3% (stabilizing or destabilizing). The test dataset of split-sample validation and fivefold cross-validation each gave a correlation of 0.77 (more than 80% prediction accuracy, with a standard error of 0.95 kcal/mol), and the jackknife test gave a correlation of 0.70 (77.4% prediction accuracy, with a standard error of 1.17 kcal/mol). The same model was applied to the mutations with ΔΔG(H₂O) values, giving a correlation of 0.78 (standard error 0.96 kcal/mol) and a prediction efficiency of 84.65%. This model can be used together with various experimental techniques for the future prediction of protein structural stability.
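The three figures of merit quoted in the abstract (correlation, standard error, and stabilizing/destabilizing accuracy) can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `evaluate_predictions` is hypothetical, the standard error is assumed here to be the standard error of estimate from a linear regression of measured on predicted values, and accuracy is assumed to mean the fraction of mutations whose predicted and measured values have the same sign. The abstract does not state these definitions, so the published analysis may differ.

```python
import math


def evaluate_predictions(predicted, measured):
    """Compare predicted and measured stability changes (kcal/mol).

    Returns (pearson_r, standard_error, sign_accuracy).
    All definitions below are assumptions made for illustration.
    """
    n = len(predicted)
    if n != len(measured) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 values")

    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((m - mean_m) ** 2 for m in measured)
    sxy = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")

    # Pearson correlation coefficient.
    pearson_r = sxy / math.sqrt(sxx * syy)

    # Assumed definition: standard error of estimate for the least-squares
    # line measured = a + b * predicted, with n - 2 degrees of freedom.
    residual_ss = max(syy - sxy ** 2 / sxx, 0.0)  # clamp rounding noise
    standard_error = math.sqrt(residual_ss / (n - 2))

    # Assumed definition: a prediction is correct when predicted and measured
    # values fall on the same side of zero (zero counts as non-positive).
    hits = sum(1 for p, m in zip(predicted, measured) if (p > 0) == (m > 0))
    sign_accuracy = hits / n

    return pearson_r, standard_error, sign_accuracy
```

Calling `evaluate_predictions(predicted_values, measured_values)` on two equal-length lists returns the three numbers in the order listed. Which sign denotes a stabilizing mutation depends on the dataset's convention, so the function only checks that the signs agree.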