Department of Chemistry, Seoul National University, Seoul, Republic of Korea.
Genome Center, University of California, Davis, California.
Proteins. 2019 Dec;87(12):1351-1360. doi: 10.1002/prot.25804. Epub 2019 Aug 30.
Scoring model structures is an essential component of protein structure prediction and can strongly affect prediction accuracy. Users of protein structure prediction results also need to score models to select the best ones for their application studies. In the Critical Assessment of techniques for protein Structure Prediction (CASP), model accuracy estimation methods have been tested in a blind fashion by providing models submitted by the tertiary structure prediction servers for scoring. In CASP13, model accuracy estimation results were evaluated in terms of both global and local structure accuracy. Global structure accuracy estimation was evaluated by the quality of the models selected by the global structure scores and by the absolute estimates of the global scores. Residue-wise local structure accuracy estimates were evaluated by three different measures. A new measure introduced in CASP13 evaluates the ability to predict inaccurately modeled regions that may be improved by refinement. An intensive comparative analysis of CASP13 and the previous CASPs revealed that the tertiary structure models generated by the CASP13 servers show very distinct features. Higher consensus toward models of higher global accuracy appeared even for free modeling targets, and many models of high global accuracy were not well optimized at the atomic level. This is related to the new technology in CASP13, deep learning for tertiary contact prediction. The tertiary model structures generated by deep learning pose a new challenge for developers of estimation of model accuracy (EMA) methods. Model accuracy estimation itself is also an area where deep learning can potentially have an impact, although current EMA methods have not fully explored that direction.
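The two global evaluation criteria mentioned above (quality of the selected model, and accuracy of the absolute score estimates) can be illustrated with a minimal sketch. The function names, the 0-to-1 score scale, and the example numbers are assumptions made for illustration only; the specific formulas used by the CASP13 assessors are not given in this abstract and may differ.

```python
from typing import Sequence


def top1_loss(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Accuracy lost by choosing the model with the highest predicted score.

    Returns the gap between the best actual score in the pool and the
    actual score of the model ranked first by the estimator (0 is ideal).
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    # Index of the model the estimator would select.
    selected = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(actual) - actual[selected]


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and actual global scores."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical scores for three server models of one target.
    predicted_scores = [0.62, 0.80, 0.55]  # estimator output (assumed scale 0-1)
    actual_scores = [0.70, 0.65, 0.50]     # scores against the experimental structure
    print(f"top-1 loss: {top1_loss(predicted_scores, actual_scores):.3f}")
    print(f"mean absolute error: {mean_absolute_error(predicted_scores, actual_scores):.3f}")
```

In this hypothetical pool the estimator ranks the second model first, although the first model is actually the most accurate, so the selection loses some accuracy even though the predicted scores are roughly on the right scale. The two criteria capture different failure modes, which is one reason to report both.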