Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA.
Proteins. 2023 Dec;91(12):1889-1902. doi: 10.1002/prot.26542. Epub 2023 Jun 26.
Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and for applying them to the study of protein function and interaction. The pairwise similarity between structural models has proven useful for estimating the quality of protein tertiary structural models, but it has rarely been applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address this gap, we developed a hybrid method (MULTICOM_qa) for estimating protein complex model accuracy that combines a pairwise similarity score (PSS) with an interface contact probability score (ICPS) derived from deep learning-based inter-chain contact prediction. MULTICOM_qa participated blindly in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and performed very well in estimating the global structure accuracy of assembly models. The average per-target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per-target ranking loss when using the predicted quality scores to rank the models is 0.14. The method was able to select good models for most targets. In addition, several key factors affecting EMA (target difficulty, model sampling difficulty, skewness of model quality, and similarity between good and bad models) are identified and analyzed. The results demonstrate that combining the multi-model method (PSS) with the complementary single-model method (ICPS) is a promising approach to EMA.
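The scoring and evaluation ideas described in the abstract can be illustrated with a minimal Python sketch. This is not the MULTICOM_qa implementation. It assumes that PSS is the average similarity of a model to every other model in the pool, that PSS and ICPS are combined by an equal-weight average (the abstract does not state the combination rule), and that ranking loss is the true score of the best model minus the true score of the top-ranked model. All function names and all numbers are hypothetical toy values.

```python
from math import sqrt


def pairwise_similarity_scores(sim):
    """Average similarity of each model to every other model in the pool."""
    n = len(sim)
    return [sum(sim[i][j] for j in range(n) if j != i) / (n - 1) for i in range(n)]


def combine_scores(pss, icps, weight=0.5):
    """Weighted average of PSS and ICPS; the 0.5 weight is an assumption."""
    return [weight * p + (1.0 - weight) * c for p, c in zip(pss, icps)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranking_loss(predicted, true):
    """True score of the best model minus true score of the top-ranked model."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]


# Toy data for one target with four models (made-up values, for illustration only).
similarity = [
    [1.0, 0.8, 0.7, 0.2],
    [0.8, 1.0, 0.6, 0.3],
    [0.7, 0.6, 1.0, 0.1],
    [0.2, 0.3, 0.1, 1.0],
]
icps = [0.6, 0.7, 0.5, 0.1]           # hypothetical single-model interface scores
true_quality = [0.75, 0.70, 0.60, 0.20]  # hypothetical true quality scores

pss = pairwise_similarity_scores(similarity)
predicted = combine_scores(pss, icps)
correlation = pearson(predicted, true_quality)
loss = ranking_loss(predicted, true_quality)

print("PSS:", pss)
print("combined:", predicted)
print("per-target correlation:", correlation)
print("per-target ranking loss:", loss)
```

Averaging these two per-target quantities over all targets would give figures comparable in kind to the 0.66 correlation and 0.14 ranking loss reported above, although the exact scoring and evaluation protocol used in CASP15 may differ from this sketch.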