PLM-T3SE: Accurate Prediction of Type III Secretion Effectors Using Protein Language Model Embeddings.

Authors

Gao Mengru, Song Chen, Liu Taigang

Affiliation

College of Information Technology, Shanghai Ocean University, Shanghai, China.

Publication information

J Cell Biochem. 2025 Jan;126(1):e30642. doi: 10.1002/jcb.30642. Epub 2024 Aug 20.

DOI: 10.1002/jcb.30642
PMID: 39164870
Abstract

Type III secretion effectors (T3SEs) are bacterial proteins synthesized by Gram-negative pathogens and delivered into host cells via the Type III secretion system (T3SS). These effectors usually play a pivotal role in the interactions between bacteria and hosts. Hence, the precise identification of T3SEs aids researchers in exploring the pathogenic mechanisms of bacterial infections. Because the diversity and complexity of T3SE sequences often make traditional experimental methods time-consuming, it is imperative to explore more efficient and convenient computational approaches for T3SE prediction. Inspired by the promising potential exhibited by pre-trained language models in protein recognition tasks, we propose a method called PLM-T3SE that uses protein language models (PLMs) for effective recognition of T3SEs. First, we used PLM embeddings and evolutionary features from position-specific scoring matrix (PSSM) profiles to transform protein sequences into fixed-length vectors for model training. Second, we employed the extreme gradient boosting (XGBoost) algorithm to rank these features by importance. Finally, a multilayer perceptron (MLP) neural network was used to predict T3SEs from the selected optimal feature set. Experimental results from cross-validation and an independent test demonstrated that our model outperformed existing models. Specifically, our model achieved an accuracy of 98.1%, which is 1.8%-42.4% higher than the state-of-the-art predictors on the same independent test set. These findings highlight the superiority of PLM-T3SE and the remarkable characterization ability of PLM embeddings for T3SE prediction.
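
The three steps in the abstract (fixed-length features, importance ranking, MLP classification) can be illustrated with a minimal sketch. This is not the authors' implementation, and several things in it are assumptions: the data are synthetic random matrices standing in for real PLM embeddings and PSSM profiles, mean pooling over residues is just one common way to obtain a fixed-length vector, scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost, and the sizes `N_PLM`, `N_PSSM`, and `TOP_K` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
N_PLM, N_PSSM, TOP_K = 16, 20, 8  # hypothetical sizes, not taken from the paper


def mean_pool(per_residue):
    """Collapse an (L, d) per-residue matrix into a fixed-length d-vector."""
    return per_residue.mean(axis=0)


def synthetic_protein(label):
    """Build one toy feature vector; real inputs would come from a PLM and PSSM."""
    length = int(rng.integers(50, 200))           # proteins vary in length
    plm = rng.normal(size=(length, N_PLM))        # stand-in for PLM embeddings
    pssm = rng.normal(size=(length, N_PSSM))      # stand-in for a PSSM profile
    plm[:, :3] += 0.5 * label                     # synthetic class signal
    return np.concatenate([mean_pool(plm), mean_pool(pssm)])


# Step 1: turn variable-length sequences into fixed-length vectors.
y = np.array([0, 1] * 100)
X = np.stack([synthetic_protein(label) for label in y])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 2: rank features by importance (gradient boosting stands in for XGBoost).
ranker = GradientBoostingClassifier(n_estimators=50, random_state=0)
ranker.fit(X_train, y_train)
top_k = np.argsort(ranker.feature_importances_)[::-1][:TOP_K]

# Step 3: train an MLP on the selected features only.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), solver="lbfgs",
                  max_iter=500, random_state=0),
)
mlp.fit(X_train[:, top_k], y_train)
accuracy = mlp.score(X_test[:, top_k], y_test)
print(f"selected features: {sorted(top_k.tolist())}")
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The feature ranking is fitted on the training split only, so the held-out proteins do not influence which features are kept. The accuracy printed here reflects the toy data and says nothing about the performance reported in the abstract.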

Similar articles

1. PLM-T3SE: Accurate Prediction of Type III Secretion Effectors Using Protein Language Model Embeddings.
   J Cell Biochem. 2025 Jan;126(1):e30642. doi: 10.1002/jcb.30642. Epub 2024 Aug 20.
2. ACNNT3: Attention-CNN Framework for Prediction of Sequence-Based Bacterial Type III Secreted Effectors.
   Comput Math Methods Med. 2020 Apr 3;2020:3974598. doi: 10.1155/2020/3974598. eCollection 2020.
3. Characterizing Secretion System Effector Proteins With Structure-Aware Graph Neural Networks and Pre-Trained Language Models.
   IEEE J Biomed Health Inform. 2024 Sep;28(9):5649-5657. doi: 10.1109/JBHI.2024.3413146. Epub 2024 Sep 5.
4. iT3SE-PX: Identification of Bacterial Type III Secreted Effectors Using PSSM Profiles and XGBoost Feature Selection.
   Comput Math Methods Med. 2021 Jan 6;2021:6690299. doi: 10.1155/2021/6690299. eCollection 2021.
5. Bastion3: a two-layer ensemble predictor of type III secreted effectors.
   Bioinformatics. 2019 Jun 1;35(12):2017-2028. doi: 10.1093/bioinformatics/bty914.
6. T3SEpp: an Integrated Prediction Pipeline for Bacterial Type III Secreted Effectors.
   mSystems. 2020 Aug 4;5(4):e00288-20. doi: 10.1128/mSystems.00288-20.
7. DeepT3: deep convolutional neural networks accurately identify Gram-negative bacterial type III secreted effectors using the N-terminal sequence.
   Bioinformatics. 2019 Jun 1;35(12):2051-2057. doi: 10.1093/bioinformatics/bty931.
8. EP3: an ensemble predictor that accurately identifies type III secreted effectors.
   Brief Bioinform. 2021 Mar 22;22(2):1918-1928. doi: 10.1093/bib/bbaa008.
9. Identifying Type III Secreted Effector Function via a Yeast Genomic Screen.
   G3 (Bethesda). 2019 Feb 7;9(2):535-547. doi: 10.1534/g3.118.200877.
10. PLM-ATG: Identification of Autophagy Proteins by Integrating Protein Language Model Embeddings with PSSM-Based Features.
    Molecules. 2025 Apr 10;30(8):1704. doi: 10.3390/molecules30081704.

Cited by

1. Exo-Tox: Identifying Exotoxins from secreted bacterial proteins.
   BioData Min. 2025 Aug 8;18(1):52. doi: 10.1186/s13040-025-00469-2.