Ahmed Md Shakil, Shahjaman Md, Kabir Enamul, Kamruzzaman Md
Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh.
Department of Statistics, Begum Rokeya University, Rangpur-5400, Bangladesh.
Bioinformation. 2018 May 31;14(5):213-218. doi: 10.6026/97320630014213. eCollection 2018.
Lysine acetylation is one of the most important types of protein post-translational modification (PTM); it is involved in many significant cellular processes and severe diseases. The experimental identification of protein acetylation sites is painstaking, time-consuming and expensive. Hence, there is significant interest in developing computational approaches for the consistent prediction of acetylation sites from protein sequences. Feature selection from protein sequences plays a significant role in acetylation site prediction. We describe an improved feature selection approach for acetylation site prediction based on the kernel naive Bayes classifier (KNBC). We show that a KNBC built on features chosen by the new feature selection method outperforms existing methods for the identification of acetylation sites. The sensitivity, specificity, accuracy (ACC), Matthews correlation coefficient (MCC) and area under the ROC curve (AUC) of the proposed method are 80.71%, 93.39%, 76.73%, 41.37% and 83.0%, respectively, at the optimum window size of 47. Thus, the kernel naive Bayes classifier finds application in acetylation site prediction.
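To make the two ideas named in the abstract more concrete, the following is a minimal sketch of sequence windowing around lysine residues and of a kernel naive Bayes classifier. It is not the authors' implementation. Everything beyond the window size of 47 is an assumption made for illustration: the window is taken to be centred on the lysine (23 residues on each side), sequence ends are padded with a hypothetical `X` symbol, the inputs to the classifier are generic numeric feature vectors, the kernel is Gaussian, and the bandwidth value is arbitrary. The feature selection method of the paper is not described in the abstract and is not reproduced here.

```python
import math
from collections import defaultdict


def extract_windows(sequence, window=47, pad="X"):
    """Return (position, window_string) for every lysine (K) in `sequence`.

    Assumption: the window is centred on the lysine and sequence ends are
    padded with `pad` so that every window has exactly `window` residues.
    """
    half = window // 2
    padded = pad * half + sequence + pad * half
    # Residue i of the original sequence sits at index i + half in `padded`,
    # so padded[i:i + window] is centred on residue i.
    return [(i, padded[i:i + window])
            for i, aa in enumerate(sequence) if aa == "K"]


class KernelNaiveBayes:
    """Naive Bayes with a Gaussian kernel density estimate per feature and class."""

    def __init__(self, bandwidth=0.5):
        self.bandwidth = bandwidth  # illustrative value, not taken from the paper

    def fit(self, X, y):
        # Store the training rows of each class; the KDE is evaluated lazily.
        self.samples = defaultdict(list)
        for row, label in zip(X, y):
            self.samples[label].append(row)
        self.priors = {c: len(rows) / len(X) for c, rows in self.samples.items()}
        return self

    def _log_density(self, value, points):
        # One-dimensional Gaussian kernel density estimate at `value`.
        h = self.bandwidth
        total = sum(math.exp(-0.5 * ((value - p) / h) ** 2) for p in points)
        density = total / (len(points) * h * math.sqrt(2 * math.pi))
        return math.log(density + 1e-300)  # guard against log(0)

    def predict(self, X):
        predictions = []
        for row in X:
            scores = {}
            for c, rows in self.samples.items():
                # Naive Bayes: log prior plus the sum of per-feature log densities.
                score = math.log(self.priors[c])
                for j, value in enumerate(row):
                    score += self._log_density(value, [r[j] for r in rows])
                scores[c] = score
            predictions.append(max(scores, key=scores.get))
        return predictions
```

In use, each window returned by `extract_windows` would first be encoded as a numeric feature vector (the encoding is left open here), a subset of those features would be selected, and `KernelNaiveBayes().fit(X, y).predict(X_new)` would then label each candidate lysine as acetylated or not.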