Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.
Donnelly Centre for Cellular and Biomolecular Research, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
Hum Mutat. 2019 Sep;40(9):1414-1423. doi: 10.1002/humu.23852. Epub 2019 Aug 7.
Predicting the impact of mutations on proteins remains an important problem. As part of the CAGI5 frataxin challenge, we evaluate the accuracy with which Provean, FoldX, and ELASPIC can predict changes in the Gibbs free energy of a protein using a limited data set of eight mutations. We find that the methods have distinct strengths and limitations, with no method being strictly superior to the others on all metrics. ELASPIC achieves the highest accuracy while also providing a web interface that simplifies the evaluation and analysis of mutations. FoldX is slightly less accurate than ELASPIC but is easier to run locally, as it does not depend on external tools or data sets. Provean achieves reasonable results while being computationally less expensive than the other methods and not requiring a structure of the protein. In addition to the methods submitted to the CAGI5 community experiment, and to provide context on other high-accuracy methods, we also evaluate predictions made by Rosetta's ddg_monomer protocol, Rosetta's cartesian_ddg protocol, and thermodynamic integration calculations using the Amber package. ELASPIC still achieves the highest accuracy, while Rosetta's cartesian_ddg protocol appears to perform best in capturing the overall trend in the data.
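The abstract distinguishes "accuracy" from "capturing the overall trend" without naming the metrics. One plausible reading, which is an assumption here and not stated in the abstract, is that accuracy corresponds to an error measure such as root-mean-square error (RMSE) and trend to a rank correlation such as Spearman's. The minimal Python sketch below shows how the two can disagree. The function names and all numeric values are hypothetical placeholders, not results from the study.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Placeholder ddG values (kcal/mol) for illustration only; NOT study data.
    observed = [0.5, 1.0, 2.0, 3.5]
    # A hypothetical predictor with a constant offset: ranking is preserved,
    # but every prediction is wrong by the same amount.
    shifted = [o + 2.0 for o in observed]
    print("RMSE:", rmse(shifted, observed))
    print("Spearman:", spearman(shifted, observed))
```

In this toy case the shifted predictor orders the mutations perfectly yet carries a large absolute error, which is the kind of split that lets one method lead on accuracy while another leads on trend.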