Salamov A A, Solovyev V V
Department of Cell Biology, Baylor College of Medicine, Houston, TX 77030, USA.
J Mol Biol. 1997 Apr 25;268(1):31-6. doi: 10.1006/jmbi.1997.0958.
The accuracy of secondary structure prediction methods has been improved significantly by the use of aligned protein sequences. The PHD method and the NNSSP method reach 71 to 72% sustained overall three-state accuracy when multiple sequence alignments are combined with neural networks and nearest-neighbor algorithms, respectively. We introduce a variant of the nearest-neighbor approach that can achieve similar accuracy using a single sequence as the query input. We compute the 50 best non-intersecting local alignments of the query sequence with each sequence from a set of proteins with known 3D structures. Each position of the query sequence is thereby aligned with database amino acids in alpha-helical, beta-strand or coil states. The secondary structure type predicted for a position is the type of the aligned positions with the maximal total score. On the dataset of 124 non-membrane, non-homologous proteins used earlier as a benchmark for secondary structure prediction, our method reaches an overall three-state accuracy of 71.2%. This accuracy is verified by an additional test on 461 non-homologous proteins, which gives an accuracy of 71.0%. The main strength of the method is its high prediction accuracy for proteins without any known homolog. Using multiple sequence alignments as input, the method has a prediction accuracy of 73.5%. Secondary structure prediction by the SSPAL method is available via the Baylor College of Medicine World Wide Web server.
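The selection rule and the accuracy measure described in the abstract can be illustrated with a minimal Python sketch. It is not the SSPAL implementation. It assumes the local alignments have already been computed, that each aligned position contributes the full score of its alignment to the state of the database residue it is paired with, and that positions covered by no alignment default to coil. None of these details is specified in the abstract. The names `predict_states` and `q3`, the state letters H/E/C, and the toy scores are hypothetical.

```python
# Illustrative sketch only: score-weighted voting over precomputed local
# alignments, followed by a three-state accuracy calculation.

STATES = ("H", "E", "C")  # alpha-helix, beta-strand, coil (assumed labels)


def predict_states(query_len, alignments, default="C"):
    """Predict one state per query position.

    alignments: iterable of (score, pairs), where pairs is a list of
    (query_position, database_state) tuples for one local alignment.
    Assumption: every aligned position adds the alignment score to the
    total of the paired database state.
    """
    totals = [dict.fromkeys(STATES, 0.0) for _ in range(query_len)]
    covered = [False] * query_len
    for score, pairs in alignments:
        for pos, state in pairs:
            totals[pos][state] += score
            covered[pos] = True
    prediction = []
    for pos in range(query_len):
        if covered[pos]:
            # State with the maximal total score; ties fall to the
            # earlier entry in STATES (an arbitrary choice).
            prediction.append(max(STATES, key=lambda s: totals[pos][s]))
        else:
            prediction.append(default)  # assumed fallback for uncovered positions
    return "".join(prediction)


def q3(predicted, observed):
    """Overall three-state accuracy: percentage of positions predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy input: three alignments against a five-residue query.
toy_alignments = [
    (10.0, [(0, "H"), (1, "H"), (2, "H")]),
    (6.0, [(1, "E"), (2, "E"), (3, "E")]),
    (5.0, [(2, "E"), (3, "C")]),
]
toy_prediction = predict_states(5, toy_alignments)
print(toy_prediction)               # HHEEC
print(q3(toy_prediction, "HHEEE"))  # 80.0
```

At position 2 of the toy query the two strand alignments together (6 + 5 = 11) outweigh the single higher-scoring helix alignment (10), which is the sense in which the state with the maximal total score, rather than the single best alignment, determines the prediction.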