Mbah Andreas N
Center for Bioinformatics & Computational Biology, Department of Biology, Jackson State University, Jackson, MS 39217, USA.
ISRN Comput Biol. 2014 Jan 8;2014:581245. doi: 10.1155/2014/581245.
ATP binding proteins fall into two groups: proteins carrying the Walker A motif, and universal stress proteins (USPs), which bind ATP through an alternative motif. There is an urgent need for a reliable and comprehensive hybrid predictor that covers both groups using whole-sequence information. In this paper, the open-source LIBSVM toolbox was used to build a classifier, which was evaluated with 10-fold cross-validation. The best hybrid model combined amino acid composition and dipeptide composition, reaching an accuracy of 84.57% and a Matthews correlation coefficient (MCC) of 0.693. This classifier performed better than many classical ATP binding protein predictors. In general, combinations of descriptors outperformed the individual descriptors, particularly when amino acid composition was part of the combination. The resulting model predicts ATP binding proteins irrespective of their functional motifs, which should help molecular biologists predict and select diverse groups of ATP binding proteins.
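The hybrid descriptor and the MCC figure quoted above can be illustrated with a minimal Python sketch. It is not the paper's code: the function names are hypothetical, compositions are expressed as fractions (the paper may have used percentages or a different scaling), and residues outside the 20 standard amino acids are simply skipped, which is an assumption. The last helper writes one sample in the sparse text format that LIBSVM reads (`label index:value ...`, with indices starting at 1).

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair in the sequence (400 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for a, b in zip(seq, seq[1:]):
        pair = a + b
        if pair in counts:  # assumption: skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def hybrid_features(seq):
    """Concatenate both descriptors into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


def to_libsvm_line(label, features):
    """Format one sample as a sparse LIBSVM line, omitting zero entries."""
    items = [f"{i}:{v:.6g}" for i, v in enumerate(features, start=1) if v]
    return " ".join([str(label)] + items)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

For example, `to_libsvm_line(1, hybrid_features(seq))` yields one line of a training file for a positive sample, where `seq` is a protein sequence string. The choice of kernel and parameters is left open here because the abstract does not state them.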