Wang Yan, Xue Zhidong, Xu Jin
Department of Control Science and Engineering, Huazhong University of Science and Technology, Wuhan City, China.
Proteins. 2006 Oct 1;65(1):49-54. doi: 10.1002/prot.21062.
We have developed AlphaTurn, a novel method based on the support vector machine (SVM) for predicting alpha-turns in proteins. Prediction was carried out on a data set of 469 nonhomologous proteins containing 967 alpha-turns. Using multiple sequence alignments generated by PSI-BLAST as input, instead of the single amino acid sequence, greatly improved prediction performance. Introducing secondary structure information predicted by PSIPRED also improved prediction performance. Moreover, we handled the highly unbalanced data set by combining the cost factor j with the "state-shifting" rule, which further improved the prediction quality of our method. The final SVM model yielded a Matthews correlation coefficient (MCC) of 0.25 in 10-fold cross-validation. To our knowledge, this MCC value is the highest obtained so far for predicting alpha-turns. An online Web server based on this method has been developed and can be freely accessed at http://bmc.hust.edu.cn/bioinformatics/ or http://210.42.106.80/.
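Because the abstract reports its result as an MCC on a very uneven data set, a minimal sketch of how that statistic is computed from per-residue predictions may help. It uses the standard MCC definition; the function names, the 0/1 label encoding (1 = alpha-turn residue), and the toy counts are assumptions for illustration and are not taken from the AlphaTurn implementation.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = alpha-turn, 0 = non-turn)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (for example, when every
    residue is predicted as the same class), a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the paper.
    y_true = [1, 0, 1, 0]
    y_pred = [1, 0, 0, 0]
    print(mcc(*confusion_counts(y_true, y_pred)))
```

A predictor that labels every residue as non-turn can reach high accuracy on such a data set yet scores an MCC of 0 under this convention, which is why MCC is a more informative measure than accuracy when the positive class is rare.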