Kryshtafovych Andriy, Barbato Alessandro, Fidelis Krzysztof, Monastyrskyy Bohdan, Schwede Torsten, Tramontano Anna
Genome Center, University of California, Davis, California 95616, USA.
Proteins. 2014 Feb;82 Suppl 2(0 2):112-26. doi: 10.1002/prot.24347. Epub 2013 Aug 31.
The article presents an assessment of the ability of the 37 model quality assessment (MQA) methods participating in CASP10 to provide an a priori estimate of the quality of structural models, and of the 67 tertiary structure prediction groups to provide confidence estimates for their predicted coordinates. The assessment of MQA predictors is based on the methods used in previous CASPs, such as correlation between the predicted and observed quality of the models (at both the global and local levels), accuracy in distinguishing between good and bad models, as well as good and bad regions within them, and the ability to identify the best models in the decoy sets. Several numerical evaluations were used in our analysis for the first time, such as comparison of global and local quality predictors with reference (baseline) predictors and a ROC analysis of the predictors' ability to differentiate between well and poorly modeled regions. For the evaluation of the reliability of self-assessment of coordinate errors, we used the correlation between the predicted and observed deviations of the coordinates and a ROC analysis of correctly identified errors in the models. A modified two-stage procedure for testing MQA methods in CASP10, whereby a small number of models spanning the whole range of model accuracy was released first, followed by the release of a larger number of models of more uniform quality, allowed a more thorough analysis of the strengths and weaknesses of different types of methods. Clustering methods were shown to have an advantage over the single- and quasi-single-model methods on the larger datasets. At the same time, the evaluation revealed that the size of the dataset has a smaller influence on the global quality assessment scores (for both clustering and nonclustering methods) than its diversity.
Narrowing the quality range of the assessed models caused a significant decrease in ranking accuracy for global quality predictors but essentially did not change the results for local predictors. Self-assessment error estimates submitted by the majority of groups were poor overall, with two research groups showing significantly better results than the remaining ones.
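The two core evaluation measures described above — correlation between predicted and observed model quality, and ROC analysis of a predictor's ability to separate well- from poorly-modeled regions — can be illustrated with a minimal sketch. This is not the authors' evaluation code; all scores and labels below are hypothetical, and the observed-quality values stand in for a GDT_TS-like reference measure:

```python
import math

def pearson(x, y):
    """Pearson correlation between predicted and observed quality scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def roc_auc(labels, scores):
    """ROC AUC: probability that a randomly chosen positive (well-modeled)
    item receives a higher score than a randomly chosen negative one."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted vs. observed global quality for five models
predicted = [0.82, 0.45, 0.67, 0.30, 0.91]
observed  = [0.78, 0.50, 0.60, 0.25, 0.88]
print(f"Pearson r = {pearson(predicted, observed):.3f}")

# Hypothetical per-residue labels (1 = well modeled) and predictor scores
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.3, 0.55, 0.8, 0.4, 0.6, 0.85, 0.2]
print(f"ROC AUC  = {roc_auc(labels, scores):.3f}")
```

The same pair of measures applies at both levels discussed in the abstract: global (one predicted score per model, correlated against the observed score) and local (per-residue scores, assessed with ROC analysis against a well/poorly-modeled cutoff).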