Large-scale comparative assessment of computational predictors for lysine post-translational modification sites.

Affiliations

School of Basic Medical Science, Qingdao University, Dengzhou Road, Qingdao, Shandong, China.

Medicinal Chemistry, Leiden Academic Centre for Drug Research, Einsteinweg, Leiden, The Netherlands.

Publication information

Brief Bioinform. 2019 Nov 27;20(6):2267-2290. doi: 10.1093/bib/bby089.

DOI:10.1093/bib/bby089

PMID:30285084

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6954452/

Abstract

Lysine post-translational modifications (PTMs) play a crucial role in regulating diverse functions and biological processes of proteins. However, because of the large volumes of sequencing data generated from genome-sequencing projects, systematic identification of different types of lysine PTM substrates and PTM sites in the entire proteome remains a major challenge. In recent years, a number of computational methods for lysine PTM identification have been developed. These methods show high diversity in their core algorithms, extracted features, feature selection techniques and evaluation strategies. There is therefore an urgent need to revisit these methods and summarize their methodologies, to improve and further develop computational techniques to identify and characterize lysine PTMs from the large amounts of sequence data. With this goal in mind, we first provide a comprehensive survey of a large collection of 49 state-of-the-art approaches for lysine PTM prediction. We cover a variety of aspects that are crucial for the development of successful predictors, including operating algorithms, sequence and structural features, feature selection, model performance evaluation and software utility. We further provide our thoughts on potential strategies to improve model performance. Second, in order to examine the feasibility of using deep learning for lysine PTM prediction, we propose a novel computational framework, termed MUscADEL (Multiple Scalable Accurate Deep Learner for lysine PTMs), using deep, bidirectional, long short-term memory recurrent neural networks for accurate and systematic mapping of eight major types of lysine PTMs in the human and mouse proteomes. Extensive benchmarking tests show that MUscADEL outperforms current methods for lysine PTM characterization, demonstrating the potential and power of deep learning techniques in protein PTM prediction.
The web server of MUscADEL, together with all the data sets assembled in this study, is freely available at http://muscadel.erc.monash.edu/. We anticipate that this comprehensive review and the application of deep learning will provide a practical guide and useful insights into PTM prediction and inspire future bioinformatics studies in related fields.
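
As an illustration only, sequence-based PTM site predictors of the kind surveyed here commonly start by cutting a fixed-length peptide window around each candidate lysine and encoding it numerically before passing it to a model such as a recurrent network. The sketch below shows that preprocessing step under stated assumptions: the function names, the flank size, the padding symbol and the one-hot encoding are hypothetical choices for this example, not the settings or code used by MUscADEL.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # hypothetical padding symbol for positions beyond the sequence ends


def lysine_windows(sequence: str, flank: int = 3) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every lysine (K) in the sequence.

    Each window has length 2 * flank + 1 with the lysine at its centre.
    The flank size is an assumption chosen for illustration.
    """
    seq = sequence.upper()
    padded = PAD * flank + seq + PAD * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "K":
            # Residue i sits at index i + flank in the padded string,
            # so the centred window is padded[i : i + 2 * flank + 1].
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def one_hot(window: str) -> List[List[int]]:
    """Encode each residue as a 20-dimensional indicator vector.

    Padding and non-standard residues become all-zero vectors.
    """
    return [[1 if aa == residue else 0 for aa in AMINO_ACIDS] for residue in window]
```

For example, `lysine_windows("MKTAYKA", flank=2)` yields one window per lysine, padded at the sequence ends, and `one_hot` turns each window into a matrix with one row per residue that a sequence model could consume.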

Similar articles

Large-scale comparative assessment of computational predictors for lysine post-translational modification sites.

Brief Bioinform. 2019 Nov 27;20(6):2267-2290. doi: 10.1093/bib/bby089.

Systematic Characterization of Lysine Post-translational Modification Sites Using MUscADEL.

Methods Mol Biol. 2022;2499:205-219. doi: 10.1007/978-1-0716-2317-6_11.

Comprehensive review and assessment of computational methods for predicting RNA post-transcriptional modification sites from RNA sequences.

Brief Bioinform. 2020 Sep 25;21(5):1676-1696. doi: 10.1093/bib/bbz112.

PTM-ssMP: A Web Server for Predicting Different Types of Post-translational Modification Sites Using Novel Site-specific Modification Profile.

Int J Biol Sci. 2018 May 22;14(8):946-956. doi: 10.7150/ijbs.24121. eCollection 2018.

Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework.

Brief Bioinform. 2019 Nov 27;20(6):2185-2199. doi: 10.1093/bib/bby079.

nhKcr: a new bioinformatics tool for predicting crotonylation sites on human nonhistone proteins based on deep learning.

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab146.

Adapt-Kcr: a novel deep learning framework for accurate prediction of lysine crotonylation sites based on learning embedding features and attention architecture.

Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac037.

Computational identification of multiple lysine PTM sites by analyzing the instance hardness and feature importance.

Sci Rep. 2021 Sep 23;11(1):18882. doi: 10.1038/s41598-021-98458-y.

Current computational tools for protein lysine acylation site prediction.

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae469.

A comprehensive review of the imbalance classification of protein post-translational modifications.

Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab089.

Cited by

Integrating Redox Proteomics and Computational Modeling to Decipher Thiol-Based Oxidative Post-Translational Modifications (oxiPTMs) in Plant Stress Physiology.

Int J Mol Sci. 2025 Jul 18;26(14):6925. doi: 10.3390/ijms26146925.

MDDeep-Ace: species-specific acetylation site prediction based on multi-domain adaptation.

PeerJ. 2025 Jul 3;13:e19649. doi: 10.7717/peerj.19649. eCollection 2025.

MTPrompt-PTM: A Multi-Task Method for Post-Translational Modification Prediction Using Prompt Tuning on a Structure-Aware Protein Language Model.

Biomolecules. 2025 Jun 9;15(6):843. doi: 10.3390/biom15060843.

A Systematic Study of Lysine Succinylation in the Pathogenic Bacterium in Aquatic Animals.

Molecules. 2025 May 31;30(11):2418. doi: 10.3390/molecules30112418.

MlyPredCSED: based on extreme point deviation compensated clustering combined with cross-scale convolutional neural networks to predict multiple lysine sites in human.

Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf189.

Dynamic Mapping of the Methylproteome Using a Chemoenzymatic Approach.

J Am Chem Soc. 2025 Mar 5;147(9):7214-7230. doi: 10.1021/jacs.4c08175. Epub 2025 Feb 25.

AlzGenPred - CatBoost-based gene classifier for predicting Alzheimer's disease using high-throughput sequencing data.

Sci Rep. 2024 Dec 5;14(1):30294. doi: 10.1038/s41598-024-82208-x.

DeepO-GlcNAc: a web server for prediction of protein O-GlcNAcylation sites using deep learning combined with attention mechanism.

Front Cell Dev Biol. 2024 Oct 10;12:1456728. doi: 10.3389/fcell.2024.1456728. eCollection 2024.

Current computational tools for protein lysine acylation site prediction.

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae469.

RMTLysPTM: recognizing multiple types of lysine PTM sites by deep analysis on sequences.

Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad450.

