Yu Bin, Lou Lifeng, Li Shan, Zhang Yusen, Qiu Wenying, Wu Xue, Wang Minghui, Tian Baoguang
College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China; CAS Key Laboratory of Geospace Environment, Department of Geophysics and Planetary Science, University of Science and Technology of China, Hefei 230026, China; Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China.
College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China.
J Mol Graph Model. 2017 Sep;76:260-273. doi: 10.1016/j.jmgm.2017.07.012. Epub 2017 Jul 14.
Predicting protein structural class plays an important role in protein structure and function analysis, drug design, and many other biological applications, yet it remains challenging for low-similarity sequences. Based on the theory of wavelet denoising, this paper presents, for the first time, a wavelet-denoising method for predicting protein structural class. First, features of the protein sequence are extracted using Chou's pseudo amino acid composition (PseAAC). Next, the extracted feature information is denoised with a two-dimensional (2D) wavelet. Finally, the optimal feature vectors are input to a support vector machine (SVM) classifier to predict protein structural classes. We obtained significant predictive results using the jackknife test on three low-similarity protein structural class datasets (25PDB, 1189, and 640) and compared our method with previous methods. The results indicate that the proposed method can effectively improve the prediction accuracy of protein structural class and can serve as a reliable prediction tool, especially for low-similarity sequences.
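The first two steps of the pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the PseAAC parameters, the physicochemical properties, the wavelet family, the decomposition level, or the thresholding rule. The sketch assumes a single property (the Kyte-Doolittle hydropathy scale), hypothetical values `lam=2` and `w=0.05`, a single-level Haar wavelet with soft thresholding, and a feature matrix whose rows are proteins and whose columns are features. The function names and the toy sequences are made up for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Kyte-Doolittle hydropathy values (assumed property; the paper may use others).
KD = dict(zip(AMINO_ACIDS, [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8,
                            1.9, -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3]))

def pseaac(seq, lam=2, w=0.05):
    """PseAAC vector of length 20 + lam for a sequence of standard residues."""
    vals = np.array([KD[a] for a in AMINO_ACIDS])
    # Standardize the property over the 20 amino acids (zero mean, unit std).
    h = dict(zip(AMINO_ACIDS, (vals - vals.mean()) / vals.std()))
    freq = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float) / len(seq)
    x = np.array([h[a] for a in seq])
    # k-th tier sequence-order correlation: mean squared property difference
    # between residues that are k positions apart.
    theta = np.array([np.mean((x[:-k] - x[k:]) ** 2) for k in range(1, lam + 1)])
    denom = freq.sum() + w * theta.sum()
    return np.concatenate([freq, w * theta]) / denom

def haar2d_denoise(X, thresh):
    """Single-level 2D Haar transform, soft-threshold the detail sub-bands,
    then invert. X needs an even number of rows and of columns."""
    s = np.sqrt(2.0)

    def split_cols(M):
        return (M[:, 0::2] + M[:, 1::2]) / s, (M[:, 0::2] - M[:, 1::2]) / s

    def merge_cols(lo, hi):
        M = np.empty((lo.shape[0], 2 * lo.shape[1]))
        M[:, 0::2] = (lo + hi) / s
        M[:, 1::2] = (lo - hi) / s
        return M

    def soft(c):
        return np.sign(c) * np.maximum(np.abs(c) - thresh, 0.0)

    # Forward transform: pairs of rows first, then pairs of columns.
    row_lo, row_hi = (X[0::2] + X[1::2]) / s, (X[0::2] - X[1::2]) / s
    LL, LH = split_cols(row_lo)
    HL, HH = split_cols(row_hi)
    # Shrink only the detail coefficients; keep the approximation LL intact.
    row_lo = merge_cols(LL, soft(LH))
    row_hi = merge_cols(soft(HL), soft(HH))
    out = np.empty(X.shape, dtype=float)
    out[0::2] = (row_lo + row_hi) / s
    out[1::2] = (row_lo - row_hi) / s
    return out

# Toy sequences, made up for illustration only.
seqs = ["MKTAYIAKQRQISFVKSHFSRQ", "GSSGSSGLVPRGSHMASMTGGQ",
        "ACDEFGHIKLMNPQRSTVWY", "MVLSPADKTNVKAAWGKVGAHA"]
X = np.array([pseaac(s) for s in seqs])        # 4 proteins x 22 features
X_denoised = haar2d_denoise(X, thresh=0.01)    # same shape as X
```

In the described method, the denoised rows would then be passed to an SVM and evaluated with the jackknife (leave-one-out) test. That step is omitted here to keep the sketch dependent on NumPy alone; a library such as scikit-learn provides both an SVM classifier and a leave-one-out splitter that could fill that role.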