Lu Cheng-Tsung, Lee Tzong-Yi, Chen Yu-Ju, Chen Yi-Ju
Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan.
Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan.
Biomed Res Int. 2014;2014:528650. doi: 10.1155/2014/528650. Epub 2014 Jul 24.
Lysine acetylation is an important and ubiquitous posttranslational modification conserved in prokaryotes and eukaryotes. The process, which is dynamically and temporally regulated by histone acetyltransferases and deacetylases, is crucial for numerous essential biological processes such as transcriptional regulation, cellular signaling, and stress response. Because experimental identification of lysine acetylation sites within proteins is time-consuming and labor-intensive, several computational approaches have been developed to identify candidates for experimental validation. In this work, acetylated protein data collected from UniProtKB were categorized as histone or nonhistone proteins. Support vector machines (SVMs) were used to build predictive models: the histone model uses amino acid pair composition (AAPC) as its feature, and the nonhistone model combines BLOSUM62 and AAPC features. Furthermore, maximal dependence decomposition (MDD) clustering enhanced model performance in fivefold cross-validation, yielding a sensitivity of 0.863, specificity of 0.885, accuracy of 0.880, and Matthews correlation coefficient (MCC) of 0.706. The proposed method was also evaluated on independent test sets, giving a predictive accuracy of 74%. This indicates that the performance of our method is comparable with that of other acetylation prediction methods.
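As a rough illustration of two ideas named in the abstract, the sketch below computes an amino acid pair composition (AAPC) vector for a sequence window and derives sensitivity, specificity, accuracy, and MCC from confusion-matrix counts. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define AAPC precisely, so the code assumes it means the normalized frequency of adjacent residue pairs over the 20 standard amino acids (400 features). The function names `aapc_features` and `binary_metrics`, the example window, and the example counts are all hypothetical.

```python
from itertools import product
from math import sqrt

# Assumption: AAPC is taken over the 20 standard amino acids, giving 400 ordered pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def aapc_features(window):
    """Return a 400-element list of adjacent-pair frequencies for a sequence window.

    Pairs containing a non-standard character (e.g. a padding symbol) are skipped.
    """
    counts = [0] * len(PAIRS)
    total = 0
    for first, second in zip(window, window[1:]):
        idx = PAIR_INDEX.get(first + second)
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PAIRS)
    return [c / total for c in counts]


def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, accuracy, and MCC from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative window only; not taken from the study's data set.
    features = aapc_features("GKGGKGLGKGGAKRHRK")
    nonzero = {PAIRS[i]: round(v, 3) for i, v in enumerate(features) if v > 0}
    print(nonzero)
    # Illustrative counts only; not the study's results.
    print(binary_metrics(tp=8, fp=2, tn=8, fn=2))
```

A vector like this could be passed, alone or concatenated with BLOSUM62-encoded positions, to any SVM implementation. The kernel and parameters used in the study are not stated in the abstract, so none are assumed here.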