
PBertKla: a protein large language model for predicting human lysine lactylation sites.

Authors

Lai Hongyan, Luo Diyu, Yang Mi, Zhu Tao, Yang Huan, Luo Xinwei, Wei Yijie, Xie Sijia, Hong Feitong, Shu Kunxian, Dao Fuying, Ding Hui

Affiliations

Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.

Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China.

Publication information

BMC Biol. 2025 Apr 7;23(1):95. doi: 10.1186/s12915-025-02202-1.

DOI: 10.1186/s12915-025-02202-1
PMID: 40189537
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11974188/

Abstract

BACKGROUND

Lactylation is a newly discovered type of post-translational modification, primarily occurring on lysine (K) residues of both histones and non-histones to exert diverse effects on target proteins. Research has shown that lysine lactylation (Kla) modification is ubiquitous in different cells and participates in the determination of cell function and fate, as well as in the initiation and progression of various diseases. Precise identification of Kla sites is fundamental for elucidating their biological functions and uncovering their application potential.

RESULTS

Here, we propose a novel human Kla site predictor, PBertKla. We curated a reliable benchmark dataset with an appropriate sample length and sequence identity threshold, then used it to train a protein large language model with optimal hyperparameters. Extensive experiments consistently demonstrated that the model predicts human Kla sites robustly, achieving an AUC (area under the receiver operating characteristic curve) of over 0.880 on the independent validation data. Feature visualization analysis further validated the effectiveness of PBertKla in feature learning and representation from Kla sequences. Moreover, we benchmarked PBertKla against other cutting-edge models on an independent testing dataset from different sources, highlighting its superiority and transferability.
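
To make the reported metric concrete, the following is a minimal sketch of how an AUC can be computed from predicted scores and true labels. It uses the rank-based (Mann-Whitney) formulation, which equals the probability that a randomly chosen positive site receives a higher score than a randomly chosen negative one. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration only; they are not the authors' code or data.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: P(score_pos > score_neg) + 0.5 * P(tie).

    labels: iterable of 0/1 (1 = Kla site, 0 = non-Kla site)
    scores: iterable of predicted scores, higher means more likely positive
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")

    # Mann-Whitney U statistic for the positive class, normalised to [0, 1].
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Hypothetical toy data, not taken from the paper.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value above 0.880 means that in roughly 88% of positive/negative pairs the true Kla site is scored higher.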

CONCLUSIONS

All results indicated that PBertKla excelled as an automatic predictor of human Kla sites, and it would advance the investigation of lactylation modifications and their significance in health and disease.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a25/11974188/623ecd9b249b/12915_2025_2202_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a25/11974188/ffb8ad22c619/12915_2025_2202_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a25/11974188/234b6f5a4283/12915_2025_2202_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a25/11974188/1f6e0644408d/12915_2025_2202_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a25/11974188/420e333946ce/12915_2025_2202_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a25/11974188/b17b0ae6c4f9/12915_2025_2202_Fig6_HTML.jpg

Similar articles

1. PBertKla: a protein large language model for predicting human lysine lactylation sites.
BMC Biol. 2025 Apr 7;23(1):95. doi: 10.1186/s12915-025-02202-1.
2. FSL-Kla: A few-shot learning-based multi-feature hybrid system for lactylation site prediction.
Comput Struct Biotechnol J. 2021 Aug 10;19:4497-4509. doi: 10.1016/j.csbj.2021.08.013. eCollection 2021.
3. Lactylation prediction models based on protein sequence and structural feature fusion.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbad539.
4. DeepKlapred: A deep learning framework for identifying protein lysine lactylation sites via multi-view feature fusion.
Int J Biol Macromol. 2024 Dec;283(Pt 3):137668. doi: 10.1016/j.ijbiomac.2024.137668. Epub 2024 Nov 19.
5. Global Profiling of Lysine Acetylation and Lactylation in Kupffer Cells.
J Proteome Res. 2023 Dec 1;22(12):3683-3691. doi: 10.1021/acs.jproteome.3c00156. Epub 2023 Oct 28.
6. Auto-Kla: a novel web server to discriminate lysine lactylation sites using automated machine learning.
Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad070.
7. Lysine lactylation-based insight to understanding the characterization of cervical cancer.
Biochim Biophys Acta Mol Basis Dis. 2024 Oct;1870(7):167356. doi: 10.1016/j.bbadis.2024.167356. Epub 2024 Jul 16.
8. Subcellular Proteomic Mapping of Lysine Lactylation.
J Am Soc Mass Spectrom. 2024 Dec 4;35(12):3221-3232. doi: 10.1021/jasms.4c00366. Epub 2024 Nov 21.
9. Lactylome analyses suggest systematic lysine-lactylated substrates in oral squamous cell carcinoma under normoxia and hypoxia.
Cell Signal. 2024 Aug;120:111228. doi: 10.1016/j.cellsig.2024.111228. Epub 2024 May 17.
10. Ubiquitous protein lactylation in health and diseases.
Cell Mol Biol Lett. 2024 Feb 5;29(1):23. doi: 10.1186/s11658-024-00541-5.

Cited by

1. Hyperhomocysteinemia in chronic schizophrenia: prevalence, clinical correlates, and paradoxical associations with symptom severity.
Eur Arch Psychiatry Clin Neurosci. 2025 Sep 9. doi: 10.1007/s00406-025-02106-9.
2. Lactylation in tumor: mechanisms and therapeutic potentials.
Front Immunol. 2025 Jun 16;16:1609596. doi: 10.3389/fimmu.2025.1609596. eCollection 2025.
