Zhang Jian, Qian Jingjing, Wang Pei, Liu Xuan, Zhang Fuhao, Chai Haiting, Zou Quan
School of Computer and Information Technology, Xinyang Normal University, Xinyang, 464000, China.
Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324003, China.
Adv Sci (Weinh). 2025 Jun;12(23):e2500581. doi: 10.1002/advs.202500581. Epub 2025 Mar 27.
Protein carbonylation refers to the covalent modification of proteins through the attachment of carbonyl groups, which arise from oxidative stress. This modification is biologically significant, as it can alter protein function, signaling cascades, and cellular homeostasis. Accurate prediction of carbonylation sites offers valuable insights into the mechanisms underlying protein carbonylation and the pathogenesis of related diseases. Notably, carbonylation sites and ligand interaction sites, both of which are functional sites, exhibit numerous similarities. A survey reveals that current computational approaches tend to produce excessive cross-predictions for ligand interaction sites. To tackle this unresolved challenge, selective carbonylation sites (SCANS), a novel deep learning-based framework, is introduced. SCANS employs a multilevel attention strategy to capture both local (segment-level) and global (protein-level) features, utilizes a tailored loss function to penalize cross-predictions (residue-level), and applies transfer learning to improve the specificity of the overall network by leveraging knowledge from a pretrained model. These designs are shown to boost predictive performance, and SCANS statistically outperforms current methods. In particular, results on the benchmark testing dataset demonstrate that SCANS consistently achieves low false positive rates, including low rates of cross-prediction. Furthermore, motif analyses and interpretations are conducted to provide novel insights into protein carbonylation sites from various perspectives.
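The abstract does not give the form of the tailored loss function, so the following is only a minimal sketch of one way a residue-level loss could penalize cross-predictions. It assumes a binary cross-entropy in which non-carbonylated residues annotated as ligand interaction sites receive a larger weight. The function name `cross_penalized_bce`, the `cross_weight` parameter, and the weighting scheme itself are hypothetical and are not taken from SCANS.

```python
import math


def cross_penalized_bce(probs, is_carbonylation, is_ligand_site, cross_weight=2.0):
    """Mean binary cross-entropy over residues, with extra weight on
    ligand-interaction residues that are not carbonylation sites.

    Hypothetical illustration only; this is not the loss used by SCANS.
    probs            -- predicted carbonylation probability per residue
    is_carbonylation -- 1 if the residue is a carbonylation site, else 0
    is_ligand_site   -- 1 if the residue is a ligand interaction site, else 0
    cross_weight     -- assumed multiplier applied to cross-prediction errors
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for p, y, lig in zip(probs, is_carbonylation, is_ligand_site):
        p = min(max(p, eps), 1.0 - eps)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        # A confident prediction on a ligand-only residue is a cross-prediction,
        # so its loss is scaled up.
        if lig and not y:
            loss *= cross_weight
        total += loss
    return total / len(probs)


# Two negative residues predicted with high probability; the first is a ligand site.
penalized = cross_penalized_bce([0.9, 0.9], [0, 0], [1, 0], cross_weight=2.0)
plain = cross_penalized_bce([0.9, 0.9], [0, 0], [1, 0], cross_weight=1.0)
print(penalized > plain)
```

With `cross_weight=1.0` the function reduces to ordinary binary cross-entropy, so the multiplier isolates the effect of the assumed cross-prediction penalty.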