KbhbXG: A machine learning architecture based on XGBoost for prediction of lysine β-hydroxybutyrylation (Kbhb) modification sites.

Author information

Department of Statistics, University of Science and Technology Beijing, Beijing 100083, China.

The Open University of China, Beijing 100039, China.

Publication information

Methods. 2024 Jul;227:27-34. doi: 10.1016/j.ymeth.2024.04.016. Epub 2024 Apr 27.

Abstract

Lysine β-hydroxybutyrylation is an important post-translational modification (PTM) involved in various physiological and biological processes. In this research, we introduce a novel predictor, KbhbXG, which uses XGBoost to identify β-hydroxybutyrylation modification sites from protein sequence information. Traditional experimental methods that identify β-hydroxybutyrylated sites with proteomic techniques are both costly and time-consuming. Computational methods and predictors can therefore play a crucial role in the rapid identification of β-hydroxybutyrylation sites. Our proposed KbhbXG model first utilizes the machine learning algorithm XGBoost to predict β-hydroxybutyrylation modification sites. On the independent test set, KbhbXG achieves an accuracy of 0.7457, a specificity of 0.7771, and an area under the curve (AUC) of 0.8172. The high AUC achieved by our method demonstrates its potential for identifying novel β-hydroxybutyrylation sites, thereby facilitating further research into the β-hydroxybutyrylation process. In addition, functional analyses reveal that different organisms preferentially engage in distinct biological processes and pathways, which can provide valuable insights into the mechanism of β-hydroxybutyrylation and guide experimental verification. To promote transparency and reproducibility, we have made both the code and the dataset of KbhbXG publicly available at https://github.com/Lab-Xu/KbhbXG.
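For readers unfamiliar with the three metrics reported in the abstract, the following is a minimal sketch of how accuracy, specificity, and AUC can be computed for a binary site predictor. It is not taken from the KbhbXG repository: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the generic pairwise-ranking definition rather than any particular library routine.

```python
# Illustrative only: hypothetical helpers and toy data, not the KbhbXG code.

def accuracy(y_true, y_pred):
    """Fraction of sites whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def specificity(y_true, y_pred):
    """True negative rate: TN / (TN + FP)."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive site scores higher
    than a random negative site (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = modified lysine, 0 = unmodified lysine (made-up values).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

# Assumed decision threshold of 0.5 to turn scores into hard labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
spe = specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.4f} specificity={spe:.4f} auc={auc:.4f}")
```

Accuracy and specificity depend on the chosen threshold, whereas AUC depends only on how the scores rank positives against negatives, which is why the abstract reports it separately.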
