Evaluation and improvement of multiple sequence methods for protein secondary structure prediction.

Authors

Cuff J A, Barton G J

Affiliation

Laboratory of Molecular Biophysics, Oxford, United Kingdom.

Publication

Proteins. 1999 Mar 1;34(4):508-19. doi: 10.1002/(sici)1097-0134(19990301)34:4<508::aid-prot10>3.0.co;2-4.

Abstract

A new dataset of 396 protein domains is developed and used to evaluate the performance of the protein secondary structure prediction algorithms DSC, PHD, NNSSP, and PREDATOR. The maximum theoretical Q3 accuracy for a combination of these methods is shown to be 78%. A simple consensus prediction on the 396 domains, with automatically generated multiple sequence alignments, gives an average Q3 prediction accuracy of 72.9%. This is a 1% improvement over PHD, which was the best single method evaluated. Segment Overlap Accuracy (SOV) is 75.4% for the consensus method on the 396-protein set. The secondary structure definition method DSSP defines 8 states, but these are reduced by most authors to 3 for prediction. Application of the different published 8- to 3-state reduction methods shows variation of over 3% in apparent prediction accuracy. This suggests that care should be taken to compare methods using the same reduction method. Two new sequence datasets (CB513 and CB251) are derived which are suitable for cross-validation of secondary structure prediction methods without artifacts due to internal homology. A fully automatic World Wide Web service that predicts protein secondary structure by a combination of methods is available via http://barton.ebi.ac.uk/.
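The quantities named in the abstract (8- to 3-state reduction, Q3 accuracy, and a consensus over several predictors) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two reduction tables are examples of mappings of the kind the abstract refers to, not necessarily the exact ones compared in the paper; the consensus is a plain per-residue majority vote with an arbitrary tie-break, since the abstract does not state the rule used; and the function names and toy strings are hypothetical.

```python
from collections import Counter

# Two illustrative 8- to 3-state reductions of DSSP codes (assumptions,
# not necessarily the mappings evaluated in the paper).
# Any DSSP code missing from a table is treated as coil ("C").
REDUCTION_A = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}
REDUCTION_B = {"H": "H", "E": "E"}


def reduce_states(dssp, mapping):
    """Map an 8-state DSSP string onto the 3 states H, E, C."""
    return "".join(mapping.get(state, "C") for state in dssp)


def q3(predicted, observed):
    """Percentage of residues whose 3-state prediction matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def consensus(predictions, tie_break=0):
    """Per-residue majority vote over several 3-state predictions.

    Ties are resolved in favour of predictions[tie_break] when its state is
    among the tied winners (a hypothetical rule chosen for this sketch).
    """
    result = []
    for column in zip(*predictions):
        counts = Counter(column)
        top = max(counts.values())
        winners = sorted(s for s, n in counts.items() if n == top)
        preferred = column[tie_break]
        result.append(preferred if preferred in winners else winners[0])
    return "".join(result)


# Toy data: one observed DSSP string and one 3-state prediction.
observed_dssp = "CHHHHGGGTEEEEBC"
prediction = "CHHHHHHHCEEEEEC"

q3_a = q3(prediction, reduce_states(observed_dssp, REDUCTION_A))
q3_b = q3(prediction, reduce_states(observed_dssp, REDUCTION_B))
print(f"Q3 under reduction A: {q3_a:.1f}%")
print(f"Q3 under reduction B: {q3_b:.1f}%")
```

The same prediction scores differently under the two tables, which is the effect the abstract warns about: accuracies are only comparable when the same reduction is applied to the observed structure.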
