Department of Computer Science, University of Miami, Coral Gables, Florida, USA.
Proteins. 2022 Dec;90(12):2091-2102. doi: 10.1002/prot.26400. Epub 2022 Jul 30.
The estimation of protein model accuracy (EMA), also called model quality assessment (QA), is important for protein structure prediction. An accurate EMA algorithm can guide the refinement of models or select the best model, or the best parts of models, from a pool of predicted tertiary structures. We developed two novel methods, MASS2 and LAW, for predicting the residue-specific (local) quality of individual models; they incorporate residual neural networks and graph neural networks, respectively. The two methods use similar features extracted from protein models but different neural network architectures to predict the local accuracy of single models. MASS2 and LAW participated in the QA category of CASP14. According to our evaluations based on the official CASP14 criteria, MASS2 and LAW are the best and second-best methods as ranked by the Z-scores of ASE/100, AUC, and ULR-1.F1. We also evaluated MASS2, LAW, and the residue-specific predicted deviations (between model and native structure) generated by AlphaFold2 on the CASP14 AlphaFold2 tertiary structure (TS) models. On these models, LAW performed comparably to or better than the predicted deviations generated by AlphaFold2, even though LAW was not trained on any AlphaFold2 TS models. Specifically, LAW performed better on AUC and ULR scores, whereas AlphaFold2 performed better on ASE scores. This indicates that AlphaFold2 is better at predicting deviations, whereas LAW is better at classifying residues as accurate or inaccurate and at detecting unreliable local regions. MASS2 and LAW are freely available at http://dna.cs.miami.edu/MASS2-CASP14/ and http://dna.cs.miami.edu/LAW-CASP14/, respectively.
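The contrast between ASE and AUC drawn above can be illustrated with a small sketch. The abstract does not define either score, so the formulas below are assumptions based on the commonly described CASP local-accuracy conventions: ASE as 100 × (1 − mean |S(e) − S(d)|) with S(d) = 1 / (1 + (d/d0)²) and d0 = 5 Å, and AUC computed by treating a residue as accurate when its true deviation is below 3.8 Å. The function names, the constants `D0` and `CUTOFF`, and the toy numbers are all hypothetical and are not taken from MASS2, LAW, or AlphaFold2. ULR scores are not covered.

```python
from typing import Sequence

D0 = 5.0      # assumed S-function scale in angstroms (not stated in the abstract)
CUTOFF = 3.8  # assumed accurate/inaccurate threshold in angstroms (not stated in the abstract)


def s_score(d: float, d0: float = D0) -> float:
    """Map a deviation in angstroms to (0, 1]; 1 means no deviation."""
    return 1.0 / (1.0 + (d / d0) ** 2)


def ase(pred: Sequence[float], true: Sequence[float], d0: float = D0) -> float:
    """Assumed ASE: 100 * (1 - mean absolute difference of S-scores)."""
    if len(pred) != len(true) or len(pred) == 0:
        raise ValueError("pred and true must be non-empty and of equal length")
    err = sum(abs(s_score(p, d0) - s_score(t, d0)) for p, t in zip(pred, true))
    return 100.0 * (1.0 - err / len(pred))


def auc(pred: Sequence[float], true: Sequence[float], cutoff: float = CUTOFF) -> float:
    """ROC AUC for 'residue is accurate' (true deviation < cutoff).

    A smaller predicted deviation is read as higher confidence that the
    residue is accurate. Computed as the fraction of (accurate, inaccurate)
    pairs ranked correctly, with ties counted as half.
    """
    pos = [p for p, t in zip(pred, true) if t < cutoff]
    neg = [p for p, t in zip(pred, true) if t >= cutoff]
    if not pos or not neg:
        raise ValueError("need at least one accurate and one inaccurate residue")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    true_dev = [1.0, 2.0, 6.0, 10.0]        # toy true deviations
    compressed = [3.0, 3.5, 4.0, 4.5]       # right ordering, wrong magnitudes
    print(ase(compressed, true_dev), auc(compressed, true_dev))

    near_cutoff_true = [3.0, 4.5]
    near_cutoff_pred = [4.0, 3.5]           # close in value, wrong ordering
    print(ase(near_cutoff_pred, near_cutoff_true),
          auc(near_cutoff_pred, near_cutoff_true))
```

Under these assumed definitions, the first toy predictor ranks every residue correctly (AUC of 1) while losing ASE because its magnitudes are off, and the second stays numerically close to the true deviations yet misorders the two residues around the cutoff. This is one way a method can lead on AUC while trailing on ASE, or the reverse.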