An efficient machine-learning framework for predicting protein post-translational modification sites.

Authors

Elreify Heba M, El-Samie Fathi E Abd, Dessouky Moawad I, Torkey Hanaa, El-Khamy Said E, Shalaby Wafaa A

Affiliations

Department of Electronics and Electrical Communications Engineering, Faculty of Electronic Engineering, Menoufia University, Menouf 32952, Egypt.

Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia.

Publication information

Sci Rep. 2025 Aug 25;15(1):31179. doi: 10.1038/s41598-025-13178-x.

DOI:10.1038/s41598-025-13178-x
PMID:40854916
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12379237/
Abstract

Post-translational modifications (PTMs), particularly lysine 2-hydroxyisobutyrylation (Khib), are critical regulatory mechanisms governing protein structure and function, with mounting evidence underscoring their implications in cellular metabolism, transcriptional regulation, and pathological processes. Despite this significance, the experimental identification of Khib sites remains constrained by resource-intensive methodologies and the transient nature of these modifications. To overcome these limitations, we introduce HyLightKhib, a computational framework that uses the Light Gradient Boosting Machine (LightGBM) architecture for accurate Khib site prediction. Our approach relies on a hybrid feature extraction strategy that integrates Evolutionary Scale Modeling (ESM-2) embeddings with comprehensive Composition, Transition, and Distribution (CTD) descriptors and curated amino acid physicochemical properties for fixed-length peptides of 43 amino acids. With mutual information-based feature selection, the proposed classifier performed considerably better than contemporary algorithms, including XGBoost and CatBoost implementations. Cross-species validation on diverse organisms, including human, parasite, and rice, achieved Area Under the Receiver Operating Characteristic Curve (AUC-ROC) scores of 0.893, 0.876, and 0.847, respectively, outperforming existing predictors such as DeepKhib and ResNetKhib. HyLightKhib represents an advancement in computational PTM prediction, providing enhanced predictive performance and valuable biological insights with direct implications for functional proteomics and PTM-targeted therapies.
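
To make the feature-extraction step concrete, the following is a minimal sketch of how 43-residue peptides and two of the CTD descriptor families (composition and transition) might be computed. It is not the authors' implementation. Several details are assumptions not stated in the abstract: that the candidate lysine sits at the center of the window (21 flanking residues per side), that windows near the termini are padded with a hypothetical `X` symbol, and that residues are grouped by a commonly used three-class hydrophobicity scheme. The function names are illustrative. The distribution descriptors, ESM-2 embeddings, mutual information-based selection, and LightGBM classifier are omitted because they need external packages.

```python
from collections import Counter

WINDOW = 43          # peptide length stated in the abstract
FLANK = WINDOW // 2  # 21 residues per side (assumes a centered lysine)
PAD = "X"            # hypothetical padding symbol for windows near the termini

# A commonly used three-class hydrophobicity grouping for CTD descriptors.
# The grouping actually used by the authors is not given in the abstract.
GROUPS = {
    "polar": set("RKEDQN"),
    "neutral": set("GASTPHY"),
    "hydrophobic": set("CLVIMFW"),
}


def lysine_windows(sequence):
    """Yield (position, window) for every lysine, padding at the termini."""
    padded = PAD * FLANK + sequence + PAD * FLANK
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in `sequence` is index i + FLANK in `padded`,
            # so this slice places the lysine at window[FLANK].
            yield i, padded[i:i + WINDOW]


def group_of(residue):
    """Return the group name of a residue, or None for padding/unknown."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    return None


def ctd_composition(window):
    """Fraction of recognized residues in each group (the C of CTD)."""
    labels = [g for g in map(group_of, window) if g is not None]
    counts = Counter(labels)
    total = len(labels) or 1
    return {name: counts[name] / total for name in GROUPS}


def ctd_transition(window):
    """Frequency of adjacent residues that switch group (the T of CTD).

    Padding and unrecognized residues are skipped before pairing.
    """
    labels = [g for g in map(group_of, window) if g is not None]
    pairs = Counter()
    for a, b in zip(labels, labels[1:]):
        if a != b:
            pairs[frozenset((a, b))] += 1
    denom = max(len(labels) - 1, 1)
    names = list(GROUPS)
    return {
        f"{a}-{b}": pairs[frozenset((a, b))] / denom
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

In a pipeline of the kind the abstract describes, vectors like these would be concatenated with the embedding and physicochemical features before feature selection and classification.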



