Monte Carlo simulation studies on the prediction of protein folding types from amino acid composition.

Author information

Zhang C T, Chou K C

Affiliation

Computational Chemistry, Upjohn Research Laboratories, Kalamazoo, Michigan 49001.

Publication information

Biophys J. 1992 Dec;63(6):1523-9. doi: 10.1016/S0006-3495(92)81728-9.

Abstract

In developing methods for the statistical prediction of protein structures, the authors of different methods have usually selected different sets of proteins to test their predictions. It is therefore hard to compare the methods fairly on the basis of the results they reported. Even when different methods are applied to the same set of proteins, a problem remains: a method that is better than another for one set of proteins will not necessarily remain so when applied to another set. To tackle this problem, a Monte Carlo simulation method is proposed to establish an objective criterion for measuring the accuracy of protein folding type prediction. This objective accuracy corresponds to the asymptotic limit generated during the Monte Carlo simulation process. On this basis, the average objective accuracy for predicting all-alpha, all-beta, alpha + beta, and alpha/beta proteins is found to be 73.0% for the least Euclidean distance method (Nakashima, H., K. Nishikawa, and T. Ooi. 1986. J. Biochem. 99:152-162) and 70.9% for the least Minkowski distance method (Chou, P.Y. 1989. Prediction in Protein Structure and the Principles of Protein Conformation. Plenum Press. New York. 549-586), indicating that the former is better than the latter. According to the original reports, however, the latter claimed a correct prediction rate of 79.7% and the former only 70.2%, which leads to the opposite conclusion. This shows that an objective criterion is necessary, and that a comparison is meaningful only when it is based on such a criterion. The simulation method and the idea developed here can also be applied to examine any other statistical prediction method.
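The idea of a least-distance classifier scored by a Monte Carlo running accuracy can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not specify how simulated proteins are sampled, so the sampling scheme here is an assumption, and the class-mean compositions, the reduced four-letter alphabet, the chain length, and all function names are hypothetical. The exponent `p` selects the distance (`p=2` gives the Euclidean distance; which exponent the cited Minkowski method uses is not stated in the abstract, so `p=1` below is only a placeholder).

```python
import random

# Hypothetical class-mean compositions over a reduced 4-letter alphabet.
# Real work would use all 20 amino acids and means taken from a training set.
CLASS_MEANS = {
    "all-alpha":  [0.40, 0.20, 0.20, 0.20],
    "all-beta":   [0.20, 0.40, 0.20, 0.20],
    "alpha+beta": [0.20, 0.20, 0.40, 0.20],
    "alpha/beta": [0.20, 0.20, 0.20, 0.40],
}

def minkowski(x, y, p):
    """Minkowski distance of order p (p=2 is the Euclidean distance)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def predict(composition, p=2):
    """Assign the class whose mean composition is closest."""
    return min(CLASS_MEANS, key=lambda c: minkowski(composition, CLASS_MEANS[c], p))

def sample_composition(mean, length, rng):
    """Assumed sampling scheme: draw `length` residues from the class-mean
    distribution and return the observed frequencies."""
    counts = [0] * len(mean)
    for idx in rng.choices(range(len(mean)), weights=mean, k=length):
        counts[idx] += 1
    return [c / length for c in counts]

def running_accuracy(n_trials, p=2, length=150, seed=0):
    """Fraction of correct predictions after each simulated protein.
    The tail of this list approximates the asymptotic accuracy."""
    rng = random.Random(seed)
    classes = list(CLASS_MEANS)
    correct = 0
    history = []
    for t in range(1, n_trials + 1):
        true_class = rng.choice(classes)
        comp = sample_composition(CLASS_MEANS[true_class], length, rng)
        correct += predict(comp, p) == true_class
        history.append(correct / t)
    return history

if __name__ == "__main__":
    for p in (2, 1):
        history = running_accuracy(5000, p=p)
        print(f"p={p}: running accuracy after 5000 trials = {history[-1]:.3f}")
```

Because both distances are scored on the same simulated population rather than on separately chosen test sets, the two final values are directly comparable, which is the point the abstract makes about an objective criterion.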

Similar articles

Improving protein structure prediction with model-based search.
Bioinformatics. 2005 Jun;21 Suppl 1:i66-74. doi: 10.1093/bioinformatics/bti1029.
Folding simulations of small proteins.
Biophys Chem. 2005 Apr 1;115(2-3):195-200. doi: 10.1016/j.bpc.2004.12.040. Epub 2005 Jan 6.

Cited by

Studies on the specificity of HIV protease: an application of Markov chain theory.
J Protein Chem. 1993 Dec;12(6):709-24. doi: 10.1007/BF01024929.

References

Prediction of protein structural class by discriminant analysis.
Biochim Biophys Acta. 1986 Nov 21;874(2):205-15. doi: 10.1016/0167-4838(86)90119-6.
Codon usage tabulated from the GenBank genetic sequence data.
Nucleic Acids Res. 1990 Apr 25;18 Suppl(Suppl):2367-411. doi: 10.1093/nar/18.suppl.2367.
Structural patterns in globular proteins.
Nature. 1976 Jun 17;261(5561):552-8. doi: 10.1038/261552a0.
