An intelligent system for identifying acetylated lysine on histones and nonhistone proteins.

Author information

Lu Cheng-Tsung, Lee Tzong-Yi, Chen Yu-Ju, Chen Yi-Ju

Affiliations

Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan.

Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan.

Publication information

Biomed Res Int. 2014;2014:528650. doi: 10.1155/2014/528650. Epub 2014 Jul 24.

Abstract

Lysine acetylation is an important and ubiquitous posttranslational modification conserved in prokaryotes and eukaryotes. This process, which is dynamically and temporally regulated by histone acetyltransferases and deacetylases, is crucial for numerous essential biological processes such as transcriptional regulation, cellular signaling, and stress response. Because experimental identification of lysine acetylation sites within proteins is time-consuming and labor-intensive, several computational approaches have been developed to identify candidates for experimental validation. In this work, acetylated protein data collected from UniProtKB were categorized into histone and nonhistone proteins. Support vector machines (SVMs) were used to build the predictive models: the histone model uses amino acid pair composition (AAPC) as its feature, and the nonhistone model combines BLOSUM62 and AAPC features. Furthermore, maximal dependence decomposition (MDD) clustering enhanced model performance in fivefold cross-validation, yielding a sensitivity of 0.863, specificity of 0.885, accuracy of 0.880, and Matthews correlation coefficient (MCC) of 0.706. The proposed method was also evaluated on independent test sets, giving a predictive accuracy of 74%. This indicates that the performance of our method is comparable to that of other acetylation prediction methods.
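The abstract names AAPC as a feature and reports sensitivity, specificity, accuracy, and MCC, but it does not give the window length, the pair definition, or any confusion-matrix counts. The Python sketch below is therefore only an illustration under stated assumptions: `aapc` counts adjacent residue pairs over the 20 standard amino acids (the paper's encoding may differ, for example by using gapped pairs), the example peptide is made up, and `metrics` uses the standard textbook definitions of the four measures. The names `aapc`, `metrics`, `AMINO_ACIDS`, and `PAIRS` are hypothetical and do not come from the paper.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids and all 400 ordered pairs (assumed feature space).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aapc(window):
    """Return a 400-dimensional vector of adjacent-pair frequencies.

    Assumption: a "pair" is two neighbouring residues. Pairs containing a
    non-standard character (e.g. a padding symbol) are skipped.
    """
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for first, second in zip(window, window[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    # Normalise by the number of counted pairs; all zeros if none were counted.
    return [counts[p] / total if total else 0.0 for p in PAIRS]


def metrics(tp, tn, fp, fn):
    """Standard sensitivity, specificity, accuracy, and MCC from raw counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up 13-residue window with a lysine (K) in the centre.
    features = aapc("AGSTPAKLVDEGR")
    print(len(features), "features,", sum(1 for f in features if f > 0), "non-zero")
    # Made-up confusion-matrix counts, not taken from the paper.
    print(metrics(tp=8, tn=9, fp=1, fn=2))
```

A vector like the one returned by `aapc` could be passed to any SVM implementation as one row of the training matrix. For the nonhistone model, the abstract indicates that BLOSUM62-derived features would be concatenated with it; that step is not shown here.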

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/46b6/4132336/c8ea2513cace/BMRI2014-528650.001.jpg
