Department of Systems Biology, Center for Biological Sequence Analysis (CBS), Technical University of Denmark, Lyngby, Denmark.
PLoS One. 2010 Nov 30;5(11):e15079. doi: 10.1371/journal.pone.0015079.
β-turns are the most common type of non-repetitive structure and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP for prediction of two-class β-turns and of the individual β-turn types, using evolutionary information and predicted protein sequence features. It has been evaluated against the commonly used dataset BT426 and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types.

In the present work, neural network methods have been trained to predict β-turn or not-β-turn and the individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not-β-turn is a superset comprising all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range 0.17-0.47. For the type-specific β-turn predictions, only types I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain values of 0.36 and 0.31, respectively.
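The two-class measures quoted above (MCC, Qtotal, sensitivity and PPV) can all be derived from a per-residue confusion matrix. The following is a minimal sketch assuming the standard textbook definitions of these measures; the function name `two_class_metrics` and the example counts are hypothetical and are not taken from NetTurnP or BT426. AUC is omitted because it requires the raw prediction scores rather than counts.

```python
import math


def two_class_metrics(tp, fp, tn, fn):
    """Compute two-class measures from per-residue counts.

    tp: residues predicted as turn that are turn
    fp: residues predicted as turn that are not turn
    tn: residues predicted as not turn that are not turn
    fn: residues predicted as not turn that are turn
    Standard definitions are assumed; percentages are returned for
    Qtotal, sensitivity and PPV, and MCC lies in [-1, 1].
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one residue is required")

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0

    # Qtotal: percentage of all residues classified correctly.
    q_total = 100.0 * (tp + tn) / total

    # Sensitivity: percentage of observed turn residues that were predicted.
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) > 0 else 0.0

    # PPV: percentage of predicted turn residues that are correct.
    ppv = 100.0 * tp / (tp + fp) if (tp + fp) > 0 else 0.0

    return {"mcc": mcc, "q_total": q_total,
            "sensitivity": sensitivity, "ppv": ppv}


if __name__ == "__main__":
    # Illustrative counts only, not results from the paper.
    for name, value in two_class_metrics(tp=50, fp=10, tn=30, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Because non-turn residues outnumber turn residues, Qtotal alone can look high for a predictor that rarely predicts turns; MCC uses all four counts, which is presumably why it serves as the headline figure for comparing methods.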
The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.