School of Mathematical Sciences, Ocean University of China, Qingdao 266100, PR China.
College of Information Science and Engineering, Ocean University of China, Qingdao 266100, PR China.
Genomics. 2020 Mar;112(2):1941-1946. doi: 10.1016/j.ygeno.2019.11.006. Epub 2019 Nov 15.
In this paper, a step-by-step classification algorithm based on a double-layer SVM model is constructed to predict the secondary structure of proteins. Its key feature is that it recasts the prediction of the α+β and α/β classes, which has historically had low accuracy, as the prediction of the all-α and all-β classes, which has high accuracy, thereby improving accuracy for the α+β and α/β classes. The method is evaluated on the widely used 25PDB dataset, whose sequences have similarity lower than 40%. The results show that the method performs well: it significantly improves accuracy for α+β class proteins while maintaining accuracy for the other three structural classes.
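The abstract does not specify how the two layers are defined, so the following is only a minimal sketch of one possible reading of a step-by-step, two-layer scheme, not the authors' implementation. It assumes a first layer that separates mixed folds from pure folds and a second layer that resolves each group. The feature layout (helix fraction, strand fraction, parallel-strand fraction), the threshold rules, and all names such as `TwoLayerClassifier` and `build_demo_classifier` are hypothetical; in practice each rule would be replaced by the decision function of a trained binary SVM.

```python
from typing import Callable, Sequence

# Assumed feature layout (hypothetical, not from the paper):
# (helix_fraction, strand_fraction, parallel_strand_fraction)
Features = Sequence[float]
BinaryRule = Callable[[Features], bool]


class TwoLayerClassifier:
    """Route a protein through two binary decisions, one per layer."""

    def __init__(self, is_mixed: BinaryRule, is_alpha: BinaryRule,
                 is_parallel: BinaryRule) -> None:
        self.is_mixed = is_mixed        # layer 1: mixed fold vs. pure fold
        self.is_alpha = is_alpha        # layer 2a: all-alpha vs. all-beta
        self.is_parallel = is_parallel  # layer 2b: alpha/beta vs. alpha+beta

    def predict(self, x: Features) -> str:
        if self.is_mixed(x):
            return "alpha/beta" if self.is_parallel(x) else "alpha+beta"
        return "all-alpha" if self.is_alpha(x) else "all-beta"


def build_demo_classifier() -> TwoLayerClassifier:
    """Build a classifier from stand-in threshold rules.

    The thresholds are arbitrary placeholders for trained SVMs.
    """
    return TwoLayerClassifier(
        is_mixed=lambda x: x[0] >= 0.15 and x[1] >= 0.15,
        is_alpha=lambda x: x[0] > x[1],
        is_parallel=lambda x: x[2] > 0.5,
    )


if __name__ == "__main__":
    clf = build_demo_classifier()
    for sample in [(0.60, 0.02, 0.0), (0.05, 0.50, 0.0),
                   (0.30, 0.30, 0.8), (0.30, 0.30, 0.1)]:
        print(sample, "->", clf.predict(sample))
```

Splitting the four-class problem into binary decisions means each stage can be trained and tuned on its own subset of classes, which is the general appeal of a layered design.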