A composite score for predicting errors in protein structure models.

Authors

Eramian David, Shen Min-yi, Devos Damien, Melo Francisco, Sali Andrej, Marti-Renom Marc A

Affiliation

Graduate Group in Biophysics, Department of Biopharmaceutical Sciences, University of California at San Francisco 94158, USA.

Publication details

Protein Sci. 2006 Jul;15(7):1653-66. doi: 10.1110/ps.062095806. Epub 2006 Jun 2.

Abstract

Reliable prediction of model accuracy is an important unsolved problem in protein structure modeling. To address this problem, we studied 24 individual assessment scores, including physics-based energy functions, statistical potentials, and machine learning-based scoring functions. Individual scores were also used to construct approximately 85,000 composite scoring functions using support vector machine (SVM) regression. The scores were tested for their abilities to identify the most native-like models from a set of 6000 comparative models of 20 representative protein structures. Each of the 20 targets was modeled using a template of <30% sequence identity, corresponding to challenging comparative modeling cases. The best SVM score outperformed all individual scores by decreasing the average RMSD difference between the model identified as the best of the set and the model with the lowest RMSD (ΔRMSD) from 0.63 Å to 0.45 Å, while having a higher Pearson correlation coefficient to RMSD (r = 0.87) than any other tested score. The most accurate score is based on a combination of the DOPE non-hydrogen atom statistical potential; surface, contact, and combined statistical potentials from MODPIPE; and two PSIPRED/DSSP scores. It was implemented in the SVMod program, which can now be applied to select the final model in various modeling problems, including fold assignment, target-template alignment, and loop modeling.
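The two evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the SVMod implementation; the function names, the toy data, and the convention that a lower score means a better model (as for an energy-like potential) are all assumptions made for illustration. The abstract also does not say whether the correlation was computed per target or over pooled models, so the sketch only shows the coefficient itself.

```python
from statistics import mean


def delta_rmsd(scores, rmsds, lower_is_better=True):
    """RMSD of the model the score ranks first, minus the lowest RMSD in the set."""
    if not scores or len(scores) != len(rmsds):
        raise ValueError("scores and rmsds must be non-empty and equal in length")
    pick = min if lower_is_better else max
    chosen = pick(range(len(scores)), key=lambda i: scores[i])
    return rmsds[chosen] - min(rmsds)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: two targets with three models each (not from the paper).
targets = {
    "target_a": {"scores": [-120.0, -95.0, -130.0], "rmsds": [2.0, 3.5, 2.5]},
    "target_b": {"scores": [-80.0, -60.0, -70.0], "rmsds": [1.5, 4.0, 2.5]},
}

# One delta-RMSD value per target, then the average over targets.
per_target = {
    name: delta_rmsd(data["scores"], data["rmsds"])
    for name, data in targets.items()
}
average_delta_rmsd = mean(list(per_target.values()))
```

A ΔRMSD of zero for a target means the score picked the most native-like model available; averaging over targets gives a single figure comparable in kind to the 0.63 Å and 0.45 Å values reported above.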
