Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri.
Department of Life Science, University of Science, Pyongyang, DPR Korea.
Proteins. 2019 Dec;87(12):1361-1377. doi: 10.1002/prot.25767. Epub 2019 Jul 16.
Methods to reliably estimate the accuracy of 3D models of proteins are a fundamental part of most protein folding pipelines and are also important for reliably identifying the best models when multiple pipelines are used. Here, we describe the progress made from CASP12 to CASP13 in the field of estimation of model accuracy (EMA), as reflected in the performance of the most successful methods in CASP13. We show small but clear progress: several methods perform better than the best methods from CASP12 when tested on CASP13 EMA targets. Some of this progress is driven by applying deep learning and residue-residue contacts to model accuracy prediction. We show that the best EMA methods select better models than the best servers in CASP13, but that there is still great potential for further improvement. In addition, according to evaluation criteria based on local similarities, such as lDDT and CAD, it is now clear that single-model accuracy methods perform relatively better than consensus-based methods.
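To make the idea of model selection by an EMA method concrete, the following is a minimal sketch and not the evaluation procedure used in the paper. It assumes a hypothetical EMA method that assigns one predicted accuracy score per model, picks the top-ranked model, and measures how far its true score falls below the best available model (often called a selection loss). The function names, model names, and all numbers are invented for illustration; the true scores stand in for a local measure such as lDDT.

```python
from typing import Dict


def select_top_model(predicted: Dict[str, float]) -> str:
    """Return the name of the model with the highest predicted accuracy."""
    return max(predicted, key=predicted.get)


def selection_loss(predicted: Dict[str, float], true: Dict[str, float]) -> float:
    """Difference between the best true score and the true score of the
    model that the (hypothetical) EMA method ranked first."""
    chosen = select_top_model(predicted)
    return max(true.values()) - true[chosen]


# Hypothetical scores for three models of a single target (illustrative only).
predicted_scores = {"model_a": 0.71, "model_b": 0.78, "model_c": 0.64}
true_scores = {"model_a": 0.75, "model_b": 0.72, "model_c": 0.60}

chosen = select_top_model(predicted_scores)
loss = selection_loss(predicted_scores, true_scores)
print(f"selected {chosen}, selection loss = {loss:.2f}")
```

A loss of zero would mean the method picked the best available model; averaging this quantity over many targets is one simple way to compare how well different methods select models.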