Multimodal deep learning for predicting protein ubiquitination sites.

Author information

Pakhrin Subash C, Beck Moriah R, Subedi Punjan, Lama Rabina, Shrestha Simonsha

Affiliations

School of Computing, Wichita State University, Wichita, KS 67260, United States.

Department of Computer Science and Engineering Technology, University of Houston-Downtown, Houston, TX 77002, United States.

Publication information

Bioinform Adv. 2025 Aug 20;5(1):vbaf200. doi: 10.1093/bioadv/vbaf200. eCollection 2025.

Abstract

MOTIVATION

Ubiquitination is a crucial post-translational modification that regulates various biological functions, including protein degradation, signal transduction, and cellular homeostasis. Accurate identification of ubiquitination sites is essential for understanding these mechanisms, yet existing prediction tools often generalize poorly across diverse datasets. To address this limitation, we developed Multimodal Ubiquitination Predictor, a deep learning approach capable of predicting ubiquitination sites across general, human-specific, and plant-specific datasets. By integrating diverse protein sequence representations (one-hot encoding, embeddings, and physicochemical properties) within a unified deep learning framework, the proposed method significantly enhances prediction accuracy and robustness, offering a valuable resource for both research and applications in ubiquitination site discovery.
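
As a rough illustration of the simplest of these representations, the sketch below one-hot encodes fixed-length sequence windows centred on lysine (K), the canonical ubiquitin attachment residue. The window half-width (`flank`), the padding symbol, and the function names are assumptions made for illustration; they are not taken from the MMUbiPred code.

```python
# Illustrative sketch only: names and parameters are hypothetical,
# not the authors' implementation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"                             # assumed padding symbol for sequence ends


def lysine_windows(sequence, flank=3):
    """Yield (position, window) for each lysine, padding both sequence ends."""
    padded = PAD * flank + sequence + PAD * flank
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Position i in the raw sequence sits at i + flank in the padded
            # one, so this slice keeps the lysine at the window centre.
            yield i, padded[i:i + 2 * flank + 1]


def one_hot(window):
    """Encode a window as a list of 20-element 0/1 rows, one row per residue.

    Padding and non-standard residues become all-zero rows.
    """
    index = {aa: j for j, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for residue in window:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for position, window in lysine_windows("MKTAYIAKQR", flank=3):
        encoded = one_hot(window)
        print(position, window, len(encoded), "x", len(encoded[0]))
```

In a multimodal setting, a matrix like this would be one input alongside learned embeddings and physicochemical descriptors of the same window; how those inputs are combined is specific to the published model and is not reproduced here.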

RESULTS

Multimodal Ubiquitination Predictor achieved superior performance across general, human-specific, and plant-specific datasets, with 77.25% accuracy, 74.98% sensitivity, 80.67% specificity, an MCC of 0.54, and an AUC of 0.87 on an independent human ubiquitination test dataset. It outperformed existing methods, demonstrating enhanced reliability for ubiquitination site prediction. This robust predictor and dataset serve as valuable resources for future research and discovery.

AVAILABILITY AND IMPLEMENTATION

The tool, programs, and training and test datasets are available at https://github.com/PakhrinLab/MMUbiPred.
