College of Chemistry, Sichuan University, Chengdu 610064, PR China.
Comput Biol Med. 2012 Nov;42(11):1053-9. doi: 10.1016/j.compbiomed.2012.08.005. Epub 2012 Sep 14.
Flavin mononucleotide (FMN) is closely involved in many biological processes. In this study, a computational method was proposed to identify FMN-binding sites from the amino acid sequences of proteins alone. A modified Position Specific Score Matrix (PSSM) was used to characterize the local sequence environment, which clearly improved performance. An ensemble SVM was applied to handle the imbalanced data problem. In addition, an independent dataset was built to evaluate the practical performance of the method, and a satisfactory accuracy of 87.87% was achieved. This demonstrates that the method is effective in predicting FMN-binding sites.
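The abstract does not spell out how the PSSM was modified or how the ensemble was assembled, so the following is only a minimal sketch of the two generic ideas it names: a sliding window of PSSM rows as per-residue features, and an ensemble trained on balanced subsets to cope with the scarcity of binding residues. The function names, the window half-width, the zero-padding at sequence ends, and the majority-vote rule are all assumptions made for illustration, and the base classifier is left as a plain callable where an SVM would be plugged in.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Sample = Tuple[Vector, int]  # (features, label); 1 = binding, 0 = non-binding


def window_features(pssm: Sequence[Sequence[float]], center: int,
                    half_width: int) -> Vector:
    """Flatten the PSSM rows in a window centred on one residue.

    Positions that fall outside the sequence are zero-padded
    (an assumed convention, not taken from the paper).
    """
    n_cols = len(pssm[0])
    feats: Vector = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(pssm):
            feats.extend(pssm[pos])
        else:
            feats.extend([0.0] * n_cols)
    return feats


def balanced_subsets(positives: Sequence[Vector], negatives: Sequence[Vector],
                     seed: int = 0) -> List[List[Sample]]:
    """Split the majority (negative) class into chunks of about the minority
    class size, and pair every chunk with all positives.

    Each returned subset is roughly balanced and would train one
    ensemble member (for example, one SVM).
    """
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    size = max(1, len(positives))
    subsets: List[List[Sample]] = []
    for start in range(0, len(shuffled), size):
        chunk = shuffled[start:start + size]
        subsets.append([(x, 1) for x in positives] + [(x, 0) for x in chunk])
    return subsets


def majority_vote(models: Sequence[Callable[[Vector], int]], x: Vector) -> int:
    """Predict 1 if strictly more than half of the members predict 1."""
    votes = sum(model(x) for model in models)
    return 1 if 2 * votes > len(models) else 0
```

In use, each subset from `balanced_subsets` would be passed to whatever SVM trainer is available, and the resulting predictors combined with `majority_vote`; weighted voting or averaging decision values are equally plausible alternatives.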