Natural language processing approach to model the secretion signal of type III effectors.

Authors

Wagner Naama, Alburquerque Michael, Ecker Noa, Dotan Edo, Zerah Ben, Pena Michelle Mendonca, Potnis Neha, Pupko Tal

Affiliations

The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel.

Department of Entomology and Plant Pathology, Auburn University, Auburn, AL, United States.

Publication

Front Plant Sci. 2022 Oct 31;13:1024405. doi: 10.3389/fpls.2022.1024405. eCollection 2022.

DOI: 10.3389/fpls.2022.1024405
PMID: 36388586
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9659976/
Abstract

Type III effectors are proteins injected by Gram-negative bacteria into eukaryotic hosts. In many plant and animal pathogens, these effectors manipulate host cellular processes to the benefit of the bacteria. Type III effectors are secreted by a type III secretion system that must "classify" each bacterial protein into one of two categories: proteins that should be translocated and proteins that should not. It was previously shown that type III effectors have a secretion signal within their N-terminus; however, despite numerous efforts, the exact biochemical identity of this secretion signal is generally unknown. Computational characterization of the secretion signal is important for identifying novel effectors and for better understanding the molecular translocation mechanism. In this work, we developed novel machine-learning algorithms for characterizing the secretion signal in both plant and animal pathogens. Specifically, we represented each protein as a vector in high-dimensional space using Facebook's protein language model. Classification algorithms were then used to separate effectors from non-effector proteins. We subsequently curated a benchmark dataset of hundreds of effectors and thousands of non-effector proteins. We showed that, on this curated dataset, our novel approach yielded substantially better classification accuracy than previously developed methodologies. We also tested the hypothesis that plant and animal pathogen effectors are characterized by different secretion signals. Finally, we integrated the novel approach into Effectidor, a web server for predicting type III effector proteins, leading to more accurate discrimination of effectors from non-effectors.
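
The embed-then-classify idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the embedding is a plain amino-acid frequency vector over the N-terminal residues, standing in for the protein language model; the classifier is a simple nearest-centroid rule, since the abstract does not say which classification algorithms were used; the 100-residue window is an arbitrary choice; and the sequences are made-up toy strings, not real effectors.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed(seq, n_term=100):
    """Map a protein to a fixed-length vector.

    ASSUMPTION: amino-acid frequencies of the first n_term residues are a
    stand-in for a protein language model embedding; the window size is
    arbitrary and not taken from the paper.
    """
    window = seq[:n_term].upper()
    counts = Counter(window)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


class NearestCentroid:
    """Toy classifier: assign each vector to the class with the closest mean."""

    def fit(self, X, y):
        self.centroids = {
            label: centroid([x for x, lab in zip(X, y) if lab == label])
            for label in set(y)
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda lab: distance(x, self.centroids[lab]))
            for x in X
        ]


# Hypothetical toy data: 1 = "effector", 0 = "non-effector".
# These sequences are invented for the example and carry no biological meaning.
train = [
    ("MSSNTSPQSSLSNTSQSPLS", 1),
    ("MPSTSNSQTSSPLNSSQTPS", 1),
    ("MKKLLVALAVLGLAAVAGAL", 0),
    ("MKRLIVAVLLAGLVAAGALA", 0),
]
X = [embed(seq) for seq, _ in train]
y = [label for _, label in train]

clf = NearestCentroid().fit(X, y)
queries = ["MSTSSNPQSSTLSNSPQSTS", "MKKVLLAALVAGLAVGALAA"]
predictions = clf.predict([embed(q) for q in queries])
print(predictions)
```

In a realistic pipeline, `embed` would be replaced by per-protein embeddings from a pretrained language model and the nearest-centroid rule by a trained classifier evaluated on held-out data; the two-step structure stays the same.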

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd1d/9659976/11b669449204/fpls-13-1024405-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd1d/9659976/ae82bd9016df/fpls-13-1024405-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd1d/9659976/2399b0f4bffc/fpls-13-1024405-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd1d/9659976/1da137a957e4/fpls-13-1024405-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd1d/9659976/8a8bdccc8b0c/fpls-13-1024405-g005.jpg

Similar articles

1
Natural language processing approach to model the secretion signal of type III effectors.
Front Plant Sci. 2022 Oct 31;13:1024405. doi: 10.3389/fpls.2022.1024405. eCollection 2022.
2
Predicting Type III Effector Proteins Using the Effectidor Web Server.
Methods Mol Biol. 2022;2427:25-36. doi: 10.1007/978-1-0716-1971-1_3.
3
Effectidor: an automated machine-learning-based web server for the prediction of type-III secretion system effectors.
Bioinformatics. 2022 Apr 12;38(8):2341-2343. doi: 10.1093/bioinformatics/btac087.
4
Computational prediction of type III secreted proteins from gram-negative bacteria.
BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S47. doi: 10.1186/1471-2105-11-S1-S47.
5
Computational approach to predict species-specific type III secretion system (T3SS) effectors using single and multiple genomes.
BMC Genomics. 2016 Dec 19;17(1):1048. doi: 10.1186/s12864-016-3363-1.
6
Accurate prediction of secreted substrates and identification of a conserved putative secretion signal for type III secretion systems.
PLoS Pathog. 2009 Apr;5(4):e1000375. doi: 10.1371/journal.ppat.1000375. Epub 2009 Apr 24.
7
Genome-scale identification of Legionella pneumophila effectors using a machine learning approach.
PLoS Pathog. 2009 Jul;5(7):e1000508. doi: 10.1371/journal.ppat.1000508. Epub 2009 Jul 10.
8
Features and algorithms: facilitating investigation of secreted effectors in Gram-negative bacteria.
Trends Microbiol. 2023 Nov;31(11):1162-1178. doi: 10.1016/j.tim.2023.05.011. Epub 2023 Jun 20.
9
Sequence-based prediction of type III secreted proteins.
PLoS Pathog. 2009 Apr;5(4):e1000376. doi: 10.1371/journal.ppat.1000376. Epub 2009 Apr 24.
10
A new means to identify type 3 secreted effectors: functionally interchangeable class IB chaperones recognize a conserved sequence.
mBio. 2012 Feb 14;3(1). doi: 10.1128/mBio.00243-11. Print 2012.

Cited by

1
Effectidor II: a pan-genomic AI-based algorithm for the prediction of type III secretion system effectors.
Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf272.
2
Contrastive-learning of language embedding and biological features for cross modality encoding and effector prediction.
Nat Commun. 2025 Feb 3;16(1):1299. doi: 10.1038/s41467-025-56526-1.
3
Effect of tokenization on transformers for biological sequences.
Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae196.
4
T4SEpp: A pipeline integrating protein language models to predict bacterial type IV secreted effectors.
Comput Struct Biotechnol J. 2024 Jan 23;23:801-812. doi: 10.1016/j.csbj.2024.01.015. eCollection 2024 Dec.
5
Complete genome sequence of an Israeli isolate of pv. pelargonii strain 305 and novel type III effectors identified in .
Front Plant Sci. 2023 Jun 2;14:1155341. doi: 10.3389/fpls.2023.1155341. eCollection 2023.

References

1
Effectidor: an automated machine-learning-based web server for the prediction of type-III secretion system effectors.
Bioinformatics. 2022 Apr 12;38(8):2341-2343. doi: 10.1093/bioinformatics/btac087.
2
Simple and Rapid Assembly of TALE Modules Based on the Degeneracy of the Codons and Trimer Repeats.
Genes (Basel). 2021 Nov 5;12(11):1761. doi: 10.3390/genes12111761.
3
DeepT3 2.0: improving type III secreted effector predictions by an integrative deep learning framework.
NAR Genom Bioinform. 2021 Oct 4;3(4):lqab086. doi: 10.1093/nargab/lqab086. eCollection 2021 Dec.
4
Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences.
Proc Natl Acad Sci U S A. 2021 Apr 13;118(15). doi: 10.1073/pnas.2016239118.
5
Type III secretion system effectors form robust and flexible intracellular virulence networks.
Science. 2021 Mar 12;371(6534). doi: 10.1126/science.abc9531.
6
DeepT3_4: A Hybrid Deep Neural Network Model for the Distinction Between Bacterial Type III and IV Secreted Effectors.
Front Microbiol. 2021 Jan 21;12:605782. doi: 10.3389/fmicb.2021.605782. eCollection 2021.
7
iT3SE-PX: Identification of Bacterial Type III Secreted Effectors Using PSSM Profiles and XGBoost Feature Selection.
Comput Math Methods Med. 2021 Jan 6;2021:6690299. doi: 10.1155/2021/6690299. eCollection 2021.
8
RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation.
Nucleic Acids Res. 2021 Jan 8;49(D1):D1020-D1028. doi: 10.1093/nar/gkaa1105.
9
T3SEpp: an Integrated Prediction Pipeline for Bacterial Type III Secreted Effectors.
mSystems. 2020 Aug 4;5(4):e00288-20. doi: 10.1128/mSystems.00288-20.
10
PSORTm: a bacterial and archaeal protein subcellular localization prediction tool for metagenomics data.
Bioinformatics. 2020 May 1;36(10):3043-3048. doi: 10.1093/bioinformatics/btaa136.