Prediction of the conformation and geometry of loops in globular proteins: testing ArchDB, a structural classification of loops.

Author information

Fernandez-Fuentes Narcis, Querol Enrique, Aviles Francesc X, Sternberg Michael J E, Oliva Baldomero

Affiliation

Institute of Biomedicine and Biotechnology, Universitat Autonoma de Barcelona, Bellaterra, Barcelona, Spain.

Publication information

Proteins. 2005 Sep 1;60(4):746-57. doi: 10.1002/prot.20516.

Abstract

In protein structure prediction, a central problem is defining the structure of a loop connecting two secondary structures. This problem occurs frequently in homology modeling, in fold recognition, and in several strategies for ab initio structure prediction. In our previous work, we developed a classification database of structural motifs, ArchDB. The database contains 12,665 clustered loops in 451 structural classes, with information about the phi-psi angles in the loops, and 1492 structural subclasses, with the relative locations of the bracing secondary structures. Here we evaluate the extent to which sequence information in the loop database can be used to predict loop structure. Two sequence profiles were used: an HMM profile and a PSSM derived from PSI-BLAST. A jack-knife test was performed in which homologous loops were removed according to the SCOP superfamily definition, and predictions were then made against recalculated profiles that take only sequence information into account. Two scenarios were considered: (1) prediction of structural class, with application in comparative modeling, and (2) prediction of structural subclass, with application in fold recognition and ab initio prediction. In the first scenario, structural class prediction was made directly over loops with X-ray secondary structure assignment; if the top 20 of the 451 possible classes are considered, the best prediction accuracy is 78.5%. In the second scenario, structural subclass prediction was made over loops whose boundaries were defined by PSI-PRED (Jones, J Mol Biol 1999;292:195-202) secondary structure prediction; if the top 20 of the 1492 subclasses are considered, the best accuracy is 46.7%. The accuracy of loop prediction was also evaluated by means of RMSD calculations.
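The two evaluation ideas in the abstract, removing homologous loops by SCOP superfamily before predicting and counting a prediction as correct when the true class is among the top 20 candidates, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Loop` record layout, the function names `training_set_for` and `top_k_accuracy`, and the toy data are all assumptions made for illustration, and the profile scoring that would produce the ranked lists is left out entirely.

```python
from typing import List, Sequence, Tuple

# Hypothetical record layout (an assumption, not ArchDB's schema):
# (loop_id, scop_superfamily, structural_class)
Loop = Tuple[str, str, str]


def training_set_for(query: Loop, loops: Sequence[Loop]) -> List[Loop]:
    """Jack-knife step: drop every loop that shares the query's superfamily.

    The query itself is dropped too, since it belongs to its own superfamily.
    """
    return [lp for lp in loops if lp[1] != query[1]]


def top_k_accuracy(
    ranked: Sequence[Sequence[str]],
    truth: Sequence[str],
    k: int = 20,
) -> float:
    """Fraction of loops whose true class is among the k best-ranked classes.

    ranked[i] lists candidate classes for loop i, best first.
    """
    if len(ranked) != len(truth):
        raise ValueError("ranked and truth must have the same length")
    if not truth:
        return 0.0
    hits = sum(1 for preds, t in zip(ranked, truth) if t in preds[:k])
    return hits / len(truth)


# Toy data, invented for illustration only.
loops = [
    ("l1", "sfA", "c1"),
    ("l2", "sfA", "c2"),
    ("l3", "sfB", "c1"),
]
ranked = [["c1", "c2", "c3"], ["c3", "c2", "c1"], ["c2", "c3", "c1"]]
truth = ["c1", "c1", "c3"]

remaining = training_set_for(loops[0], loops)  # only the sfB loop remains
acc_top1 = top_k_accuracy(ranked, truth, k=1)
acc_top2 = top_k_accuracy(ranked, truth, k=2)
```

In a real evaluation the ranked lists would come from scoring each query loop against profiles rebuilt from `training_set_for(query, loops)`, so that no loop from the query's superfamily contributes to the profile it is scored against.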
