Ju Zhe, He Jian-Jun
College of Science, Shenyang Aerospace University, 110136, People's Republic of China.
College of Information and Communication Engineering, Dalian Minzu University, 116600, People's Republic of China.
Anal Biochem. 2018 Jun 1;550:1-7. doi: 10.1016/j.ab.2018.04.005. Epub 2018 Apr 8.
Lysine glutarylation is a new type of protein acylation modification found in both prokaryotes and eukaryotes. To better understand the molecular mechanism of glutarylation, it is important to accurately identify glutarylated substrates and their corresponding glutarylation sites. In this study, a novel bioinformatics tool named GlutPred is developed to predict glutarylation sites by using multiple feature extraction and maximum relevance minimum redundancy feature selection. First, amino acid factors, binary encoding, and the composition of k-spaced amino acid pairs are combined to encode glutarylation sites, and the maximum relevance minimum redundancy method together with the incremental feature selection algorithm is adopted to remove redundant features. Second, a biased support vector machine algorithm is used to handle the class imbalance in the glutarylation site training dataset. In 10-fold cross-validation, GlutPred achieves satisfactory performance, with a sensitivity of 64.80%, a specificity of 76.60%, an accuracy of 74.90% and a Matthews correlation coefficient of 0.3194. Feature analysis shows that some k-spaced amino acid pair features play the most important roles in the prediction of glutarylation sites. The conclusions derived from this study might provide some clues for understanding the molecular mechanisms of glutarylation.
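As an illustration of two ingredients named in the abstract, the following minimal Python sketch computes a composition of k-spaced amino acid pairs encoding for a peptide window and the four reported evaluation measures from confusion-matrix counts. It is not the GlutPred implementation: the function names `cksaap` and `metrics`, the default `k_max`, the restriction to the 20 standard residues, and the normalisation by the number of valid pairs are all assumptions made for the example.

```python
from itertools import product
from math import sqrt

# Assumption: only the 20 standard residues are counted; gap or unknown
# symbols in a window are skipped rather than given their own feature.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap(peptide, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    Returns a flat list of 400 * (k_max + 1) frequencies. For each k,
    a pair (a, b) is counted whenever residue b follows residue a with
    exactly k residues in between.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = 0
        for i in range(len(peptide) - k - 1):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # ignore pairs with non-standard symbols
                counts[pair] += 1
                total += 1
        # Assumption: normalise by the number of valid pairs at this spacing.
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)
    return features


def metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Matthews correlation coefficient."""
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + fn + tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, acc, mcc


if __name__ == "__main__":
    # Hypothetical peptide window and hypothetical counts, for illustration only.
    vec = cksaap("AKA", k_max=1)
    print(len(vec))
    print(metrics(tp=50, fn=50, tn=50, fp=50))
```

Because negative sites outnumber positive ones in such a dataset, accuracy alone can be misleading, which is why the Matthews correlation coefficient is reported alongside sensitivity and specificity.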