
How well can the accuracy of comparative protein structure models be predicted?

Authors

Eramian David, Eswar Narayanan, Shen Min-Yi, Sali Andrej

Affiliation

Graduate Group in Biophysics, University of California at San Francisco, California 94158, USA.

Publication information

Protein Sci. 2008 Nov;17(11):1881-93. doi: 10.1110/ps.036061.108. Epub 2008 Oct 1.

Abstract

Comparative structure models are available for two orders of magnitude more protein sequences than are experimentally determined structures. These models, however, suffer from two limitations that experimentally determined structures do not: They frequently contain significant errors, and their accuracy cannot be readily assessed. We have addressed the latter limitation by developing a protocol optimized specifically for predicting the Cα root-mean-squared deviation (RMSD) and native overlap (NO3.5Å) errors of a model in the absence of its native structure. In contrast to most traditional assessment scores that merely predict that one model is more accurate than others, this approach quantifies the error in an absolute sense, thus helping to determine whether or not the model is suitable for intended applications. The assessment relies on a model-specific scoring function constructed by a support vector machine. This regression optimizes the weights of up to nine features, including various sequence similarity measures and statistical potentials, extracted from a tailored training set of models unique to the model being assessed: If possible, we use similarly sized models with the same fold; otherwise, we use similarly sized models with the same secondary structure composition. This protocol predicts the RMSD and NO3.5Å errors for a diverse set of 580,317 comparative models of 6174 sequences with correlation coefficients (r) of 0.84 and 0.86, respectively, to the actual errors. This scoring function achieves the best correlation compared to 13 other tested assessment criteria that achieved correlations ranging from 0.35 to 0.71.
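
The two error measures named in the abstract, and the correlation coefficient used to compare predicted with actual errors, can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: the model is taken to be already superposed onto its native structure, and native overlap is taken to be the fraction of Cα atoms lying within 3.5 Å of their native positions. The function names and toy coordinates are hypothetical, and this is not the authors' implementation.

```python
import math


def ca_rmsd(model, native):
    """Root-mean-squared deviation over paired C-alpha coordinates.

    Assumes both coordinate lists are already superposed and in the same order.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(m, n) ** 2 for m, n in zip(model, native))
    return math.sqrt(squared / len(model))


def native_overlap(model, native, cutoff=3.5):
    """Fraction of C-alpha atoms within `cutoff` angstroms of the native position.

    The 3.5 angstrom default mirrors the NO3.5 measure; the exact definition
    used here is an assumption.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    close = sum(1 for m, n in zip(model, native) if math.dist(m, n) <= cutoff)
    return close / len(model)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy example: four C-alpha atoms, two of them displaced by 3 and 4 angstroms.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 3.0, 0.0), (7.6, 0.0, 4.0), (11.4, 0.0, 0.0)]

print(ca_rmsd(model, native))         # 2.5
print(native_overlap(model, native))  # 0.75
```

In an evaluation like the one summarized above, `pearson_r` would be applied to the predicted and actual error values over a whole set of models, whereas the first two functions need the native structure and so only supply the "actual" side of that comparison.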
