Meng Chaolu, Hu Yang, Zhang Ying, Guo Fei
College of Intelligence and Computing, Tianjin University, Tianjin, China.
College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China.
Front Bioeng Biotechnol. 2020 Mar 31;8:245. doi: 10.3389/fbioe.2020.00245. eCollection 2020.
Polystyrene binding peptides (PSBPs) play a key role in the immobilization process, and correctly identifying PSBPs is the first step of all related work. In this paper, we propose a novel support vector machine-based bioinformatic identification model. The model is built in four machine learning steps: feature extraction, feature selection, model training, and optimization. In a five-fold cross-validation test, it achieves a sensitivity (SN) of 90.38%, a specificity (SP) of 84.62%, an accuracy (ACC) of 87.50%, and an area under the curve (AUC) of 0.90. The model outperforms the state-of-the-art identifier in terms of SN and ACC while using a smaller feature set. Furthermore, we constructed a web server that hosts the proposed model, which is freely accessible at http://server.malab.cn/PSBP-SVM/index.jsp.
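For readers unfamiliar with the four evaluation measures named in the abstract, the following is a minimal sketch of how SN, SP, ACC, and AUC are conventionally computed for a binary classifier. It is not the authors' code. The function names are hypothetical, and the confusion-matrix counts are illustrative assumptions chosen only because they yield percentages matching those reported; the abstract does not state the dataset size or the actual counts.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (SN, SP, ACC) as fractions from confusion-matrix counts."""
    sn = tp / (tp + fn)                    # sensitivity: true positives recovered
    sp = tn / (tn + fp)                    # specificity: true negatives recovered
    acc = (tp + tn) / (tp + fn + tn + fp)  # accuracy: all correct predictions
    return sn, sp, acc


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts (an assumption, not data from the paper).
sn, sp, acc = confusion_metrics(tp=94, fn=10, tn=88, fp=16)
print(f"SN={sn:.2%}  SP={sp:.2%}  ACC={acc:.2%}")

# Toy decision scores (also an assumption) to illustrate the AUC definition.
toy_auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.1])
print(f"toy AUC={toy_auc:.3f}")
```

Note that SN, SP, and ACC are proportions and are naturally reported as percentages, whereas AUC is a probability-like value between 0 and 1 and is reported without a percent sign.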