Current computational tools for protein lysine acylation site prediction.

Affiliations

Collaborative Innovation Center of Henan Grain Crops, Henan Key Laboratory of Rice Molecular Breeding and High Efficiency Production, College of Agronomy, Henan Agricultural University, Zhengzhou 450046, China.

State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae469.

Abstract

As a major subtype of post-translational modification (PTM), protein lysine acylations (PLAs) play crucial roles in regulating diverse protein functions. With recent advances in proteomics technology, PTM identification is becoming a data-rich field, and there is an urgent need to translate the large amount of experimentally verified data into valuable biological insights. With computational approaches, PLA sites can be accurately detected across the whole proteome, even for organisms with small-scale datasets. Herein, we present a comprehensive summary of 166 in silico PLA prediction methods, covering predictors for a single type of PLA site and predictors for multiple types of PLA sites. This summary covers aspects that are critical for developing a robust predictor, including data collection and preparation, sample selection, feature representation, classification algorithm design, model evaluation, and method availability. Notably, we discuss the application of protein language models and transfer learning to address the small-sample learning problem. We also highlight prediction methods developed for functionally relevant PLA sites and for species-, substrate-, and cell-type-specific PLA sites. In conclusion, this systematic review may facilitate the development of novel PLA predictors and offer useful insights to researchers from various disciplines.
