Application of multiple sequence alignment profiles to improve protein secondary structure prediction.

Author information

Cuff J A, Barton G J

Affiliation

Laboratory of Molecular Biophysics, Oxford, United Kingdom.

Publication information

Proteins. 2000 Aug 15;40(3):502-11. doi: 10.1002/1097-0134(20000815)40:3<502::aid-prot170>3.0.co;2-q.

PMID:10861942

Abstract

Training a neural network secondary structure prediction algorithm with different types of multiple sequence alignment profiles derived from the same sequences is shown to give accuracies ranging from 70.5% to 76.4%. The best accuracy of 76.4% (standard deviation 8.4%) is 3.1% (Q3) and 4.4% (SOV2) better than the PHD algorithm run on the same set of 406 sequence non-redundant proteins, none of which were used to train either method. Residues predicted by the new method with a confidence value of 5 or greater have an average Q3 accuracy of 84% and cover 68% of the residues. Relative solvent accessibility, based on a two-state model, is predicted at 76.2, 79.8, and 86.6% accuracy for 25, 5, and 0% accessibility, respectively. The sources of the improvements obtained from training with different representations of the same alignment data are described in detail. The new Jnet prediction method resulting from this study is available in the Jpred secondary structure prediction server and as a stand-alone computer program from http://barton.ebi.ac.uk/. Proteins 2000;40:502-511.
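
The Q3 and confidence-filtered figures quoted above can be illustrated with a minimal sketch. It assumes the standard definition of Q3 (percentage of residues whose three-state label, H, E, or C, is predicted correctly) and a per-residue confidence score on a 0 to 9 scale. The function names and the toy data are hypothetical and are not taken from the paper or from the Jnet program.

```python
def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def q3_at_confidence(predicted, observed, confidence, threshold=5):
    """Return (Q3, coverage) over residues with confidence >= threshold."""
    if not (len(predicted) == len(observed) == len(confidence)) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    kept = [(p, o) for p, o, c in zip(predicted, observed, confidence)
            if c >= threshold]
    coverage = 100.0 * len(kept) / len(observed)
    if not kept:
        return float("nan"), coverage
    accuracy = 100.0 * sum(p == o for p, o in kept) / len(kept)
    return accuracy, coverage


# Toy data invented for illustration only (not results from the paper).
observed = "CHHHHHCCEEEEC"
predicted = "CHHHHCCCEEEHC"
confidence = [7, 8, 9, 9, 6, 2, 5, 6, 8, 9, 7, 1, 6]

overall = q3(predicted, observed)
filtered_q3, coverage = q3_at_confidence(predicted, observed, confidence)
print(f"Q3 = {overall:.1f}%")
print(f"Q3 at confidence >= 5 = {filtered_q3:.1f}% (coverage {coverage:.1f}%)")
```

In this toy case the two mispredicted residues also carry the lowest confidence, so filtering raises accuracy at the cost of coverage. That is the same trade-off the abstract reports as 84% Q3 over 68% of residues.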
