T3SEpp: an Integrated Prediction Pipeline for Bacterial Type III Secreted Effectors.

Authors

Hui Xinjie, Chen Zewei, Lin Mingxiong, Zhang Junya, Hu Yueming, Zeng Yingying, Cheng Xi, Ou-Yang Le, Sun Ming-An, White Aaron P, Wang Yejun

Affiliations

Department of Cell Biology and Genetics, School of Basic Medicine, Shenzhen University Health Science, Shenzhen, China.

Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, College of Information Engineering, Shenzhen University, Shenzhen, China.

Publication information

mSystems. 2020 Aug 4;5(4):e00288-20. doi: 10.1128/mSystems.00288-20.

DOI:10.1128/mSystems.00288-20
PMID:32753503
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7406222/

Abstract

Many Gram-negative bacteria infect hosts and cause disease by translocating a variety of type III secreted effectors (T3SEs) into the host cell cytoplasm. However, despite a dramatic increase in the number of available whole-genome sequences, accurate prediction of T3SEs remains challenging. Traditional prediction models have focused on atypical sequence features buried in the N-terminal peptides of T3SEs, but these models have had high false-positive rates. In this research, we integrated promoter information with characteristic protein features of signal regions, chaperone-binding domains, and effector domains for T3SE prediction. Machine learning algorithms, including deep learning, were adopted to predict the atypical features mainly buried in the signal sequences of T3SEs, followed by development of a voting-based ensemble model integrating the individual prediction results. We assembled these into a unified T3SE prediction pipeline, T3SEpp, which integrated the results of the individual modules, resulting in high accuracy (∼0.94) and a >1-fold reduction in the false-positive rate compared to that of state-of-the-art software tools. The T3SEpp pipeline and the sequence features observed here will facilitate the accurate identification of new T3SEs, with numerous benefits for future studies on host-pathogen interactions.

Importance

Type III secreted effector (T3SE) prediction remains a major computational challenge. In practical applications, current software tools often suffer from high false-positive rates. One of the causal factors could be the relatively narrow range of biological features used to design and train the models. In this research, we made a comprehensive survey of the sequence-based features of T3SEs, including signal sequences, chaperone-binding domains, effector domains, and transcription factor binding promoter sites, and assembled a unified prediction pipeline integrating multi-aspect biological features within homology-based and multiple machine learning models. To our knowledge, we have compiled the most comprehensive biological sequence feature analysis for T3SEs in this research. The T3SEpp pipeline, which integrates the variety of features and assembles different models, showed high accuracy, which should facilitate more accurate identification of T3SEs in new and existing bacterial whole-genome sequences.
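The voting-based integration mentioned in the abstract can be illustrated with a minimal sketch. This is not the T3SEpp implementation: the module names, the equal default weights, and the 0.5 threshold below are all assumptions chosen for illustration, and the abstract does not specify how the actual pipeline weights or combines its modules. The sketch only shows the general idea of combining per-module yes/no calls into one decision by weighted voting.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModuleCall:
    """One module's yes/no call for a protein (names are hypothetical)."""
    name: str
    is_effector: bool
    weight: float = 1.0  # assumed weight; not taken from the paper


def vote(calls: List[ModuleCall], threshold: float = 0.5) -> bool:
    """Return True when the weighted share of positive calls exceeds threshold."""
    total = sum(c.weight for c in calls)
    if total <= 0:
        raise ValueError("at least one module with positive weight is required")
    positive = sum(c.weight for c in calls if c.is_effector)
    return positive / total > threshold


def predict(calls_by_protein: Dict[str, List[ModuleCall]],
            threshold: float = 0.5) -> Dict[str, bool]:
    """Apply the vote to every protein in a mapping of protein ID to calls."""
    return {pid: vote(calls, threshold) for pid, calls in calls_by_protein.items()}


if __name__ == "__main__":
    # Hypothetical calls for two proteins from four illustrative modules.
    example = {
        "protein_A": [
            ModuleCall("signal_sequence", True),
            ModuleCall("chaperone_binding", True),
            ModuleCall("effector_domain", False),
            ModuleCall("promoter_site", True),
        ],
        "protein_B": [
            ModuleCall("signal_sequence", True),
            ModuleCall("chaperone_binding", False),
            ModuleCall("effector_domain", False),
            ModuleCall("promoter_site", False),
        ],
    }
    for protein_id, is_t3se in predict(example).items():
        print(protein_id, "predicted T3SE" if is_t3se else "predicted non-T3SE")
```

Requiring agreement across several independent feature types, rather than relying on the N-terminal signal alone, is one plausible way such an ensemble could lower the false-positive rate: in this toy example, a protein flagged only by the signal-sequence module is not called an effector.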

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d718/7406222/5dc5f2922522/mSystems.00288-20-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d718/7406222/052861e72022/mSystems.00288-20-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d718/7406222/6e06b5e205bd/mSystems.00288-20-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d718/7406222/eebdaa926bf3/mSystems.00288-20-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d718/7406222/2d0fca21d76d/mSystems.00288-20-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d718/7406222/32c59557e539/mSystems.00288-20-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d718/7406222/975f774400dc/mSystems.00288-20-f0007.jpg

Similar articles

1
T3SEpp: an Integrated Prediction Pipeline for Bacterial Type III Secreted Effectors.
mSystems. 2020 Aug 4;5(4):e00288-20. doi: 10.1128/mSystems.00288-20.
2
Bastion3: a two-layer ensemble predictor of type III secreted effectors.
Bioinformatics. 2019 Jun 1;35(12):2017-2028. doi: 10.1093/bioinformatics/bty914.
3
DeepT3: deep convolutional neural networks accurately identify Gram-negative bacterial type III secreted effectors using the N-terminal sequence.
Bioinformatics. 2019 Jun 1;35(12):2051-2057. doi: 10.1093/bioinformatics/bty931.
4
ACNNT3: Attention-CNN Framework for Prediction of Sequence-Based Bacterial Type III Secreted Effectors.
Comput Math Methods Med. 2020 Apr 3;2020:3974598. doi: 10.1155/2020/3974598. eCollection 2020.
5
DeepT3 2.0: improving type III secreted effector predictions by an integrative deep learning framework.
NAR Genom Bioinform. 2021 Oct 4;3(4):lqab086. doi: 10.1093/nargab/lqab086. eCollection 2021 Dec.
6
BEAN 2.0: an integrated web resource for the identification and functional analysis of type III secreted effectors.
Database (Oxford). 2015 Jun 27;2015:bav064. doi: 10.1093/database/bav064. Print 2015.
7
PLM-T3SE: Accurate Prediction of Type III Secretion Effectors Using Protein Language Model Embeddings.
J Cell Biochem. 2025 Jan;126(1):e30642. doi: 10.1002/jcb.30642. Epub 2024 Aug 20.
8
EP3: an ensemble predictor that accurately identifies type III secreted effectors.
Brief Bioinform. 2021 Mar 22;22(2):1918-1928. doi: 10.1093/bib/bbaa008.
9
T3SEdb: data warehousing of virulence effectors secreted by the bacterial Type III Secretion System.
BMC Bioinformatics. 2010 Oct 15;11 Suppl 7(Suppl 7):S4. doi: 10.1186/1471-2105-11-S7-S4.
10
Computational prediction of type III secreted proteins from gram-negative bacteria.
BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S47. doi: 10.1186/1471-2105-11-S1-S47.

Cited by

1
Contrastive-learning of language embedding and biological features for cross modality encoding and effector prediction.
Nat Commun. 2025 Feb 3;16(1):1299. doi: 10.1038/s41467-025-56526-1.
2
Genome Insights into Beneficial Microbial Strains Composing SIMBA Microbial Consortia Applied as Biofertilizers for Maize, Wheat and Tomato.
Microorganisms. 2024 Dec 12;12(12):2562. doi: 10.3390/microorganisms12122562.
3
T4SEpp: A pipeline integrating protein language models to predict bacterial type IV secreted effectors.
Comput Struct Biotechnol J. 2024 Jan 23;23:801-812. doi: 10.1016/j.csbj.2024.01.015. eCollection 2024 Dec.
4
DeepSecE: A Deep-Learning-Based Framework for Multiclass Prediction of Secreted Proteins in Gram-Negative Bacteria.
Research (Wash D C). 2023 Oct 25;6:0258. doi: 10.34133/research.0258. eCollection 2023.
5
Identification and characterization of opportunistic pathogen causing potato blackleg in China.
Front Plant Sci. 2023 Mar 3;14:1097741. doi: 10.3389/fpls.2023.1097741. eCollection 2023.
6
Natural language processing approach to model the secretion signal of type III effectors.
Front Plant Sci. 2022 Oct 31;13:1024405. doi: 10.3389/fpls.2022.1024405. eCollection 2022.
7
Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics.
Front Bioinform. 2022 May 19;2:910531. doi: 10.3389/fbinf.2022.910531. eCollection 2022.
8
Chlamydia trachomatis Alters Mitochondrial Protein Composition and Secretes Effector Proteins That Target Mitochondria.
mSphere. 2022 Dec 21;7(6):e0042322. doi: 10.1128/msphere.00423-22. Epub 2022 Oct 26.
9
T1SEstacker: A Tri-Layer Stacking Model Effectively Predicts Bacterial Type 1 Secreted Proteins Based on C-Terminal Non-repeats-in-Toxin-Motif Sequence Features.
Front Microbiol. 2022 Feb 8;12:813094. doi: 10.3389/fmicb.2021.813094. eCollection 2021.
10
Genome Analysis of the sp. Strain SLB01 from the Diseased Sponge of the .
Curr Issues Mol Biol. 2021 Dec 11;43(3):2220-2237. doi: 10.3390/cimb43030156.

References

1
SignalP 5.0 improves signal peptide predictions using deep neural networks.
Nat Biotechnol. 2019 Apr;37(4):420-423. doi: 10.1038/s41587-019-0036-z. Epub 2019 Feb 18.
2
Export of a Vibrio parahaemolyticus toxin by the Sec and type III secretion machineries in tandem.
Nat Microbiol. 2019 May;4(5):781-788. doi: 10.1038/s41564-019-0368-y. Epub 2019 Feb 18.
3
DeepT3: deep convolutional neural networks accurately identify Gram-negative bacterial type III secreted effectors using the N-terminal sequence.
Bioinformatics. 2019 Jun 1;35(12):2051-2057. doi: 10.1093/bioinformatics/bty931.
4
Bastion3: a two-layer ensemble predictor of type III secreted effectors.
Bioinformatics. 2019 Jun 1;35(12):2017-2028. doi: 10.1093/bioinformatics/bty914.
5
The species-spanning family of LPX-motif harbouring effector proteins.
Cell Microbiol. 2018 Nov;20(11):e12945. doi: 10.1111/cmi.12945. Epub 2018 Sep 17.
6
Phylogenetic profiling, an untapped resource for the prediction of secreted proteins and its complementation with sequence-based classifiers in bacterial type III, IV and VI secretion systems.
Brief Bioinform. 2019 Jul 19;20(4):1395-1402. doi: 10.1093/bib/bby009.
7
DeepLoc: prediction of protein subcellular localization using deep learning.
Bioinformatics. 2017 Nov 1;33(21):3387-3395. doi: 10.1093/bioinformatics/btx431.
8
An account of in silico identification tools of secreted effector proteins in bacteria and future challenges.
Brief Bioinform. 2019 Jan 18;20(1):110-129. doi: 10.1093/bib/bbx078.
9
Salmonella SPI-2 Type III Secretion System Effectors: Molecular Mechanisms And Physiological Consequences.
Cell Host Microbe. 2017 Aug 9;22(2):217-231. doi: 10.1016/j.chom.2017.07.009.
10
Visualization and characterization of individual type III protein secretion machines in live bacteria.
Proc Natl Acad Sci U S A. 2017 Jun 6;114(23):6098-6103. doi: 10.1073/pnas.1705823114. Epub 2017 May 22.