Wagner Naama, Baumer Ella, Lyubman Iris, Shimony Yair, Bracha Noam, Martins Leonor, Potnis Neha, Chang Jeff H, Teper Doron, Koebnik Ralf, Pupko Tal
The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Chaim Levanon St 30, Tel Aviv, 69978, Israel.
CIBIO-Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO-Laboratório Associado, Universidade do Porto, Vairão, 4485-661, Portugal.
Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf272.
Type III secretion systems are used by many Gram-negative bacteria to inject type 3 effectors (T3Es) directly into eukaryotic cells, promoting disease or provoking an immune response. Because of these opposing evolutionary forces, T3E repertoires often vary within taxonomic groups. Identifying the full effector gene repertoire in genomes of related individuals is crucial for determining core and specialized effectors, understanding disease dynamics, and developing appropriate management strategies against pathogens. It can also help uncover novel T3Es that have recently emerged in a population. Our previously published Effectidor web server successfully addressed the challenge of identifying T3Es in a single bacterial genome. Here, we have enriched the web server with various novel capabilities, including the identification of T3Es from multiple genome sequences simultaneously.
We present Effectidor II, a web server that relies on machine learning to predict T3E-encoding genes within bacterial pan-genomes. We demonstrate the benefit of learning from features extracted from all the sequences that make up the pan-genome, and we report a novel T3E that Effectidor II discovered in Xanthomonas euroxanthea.
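The distinction between core and specialized effectors can be illustrated with a minimal sketch. It assumes that each genome's predicted T3E repertoire is already available as a set of effector family labels; the function name, genome names, and effector labels below are hypothetical placeholders and are not part of Effectidor II or its output format.

```python
from typing import Dict, Set, Tuple


def split_core_and_accessory(
    repertoires: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str]]:
    """Split effector families into core (present in every genome)
    and accessory (present in only some genomes).

    `repertoires` maps a genome name to the set of effector family
    labels predicted in that genome (hypothetical input format).
    """
    if not repertoires:
        return set(), set()
    # Pan-genome effector repertoire: union over all genomes.
    all_effectors = set().union(*repertoires.values())
    # Core effectors: intersection over all genomes.
    core = set.intersection(*repertoires.values())
    return core, all_effectors - core


# Placeholder data for illustration only (not real predictions).
example = {
    "genome_A": {"effector_1", "effector_2", "effector_3"},
    "genome_B": {"effector_1", "effector_2"},
    "genome_C": {"effector_1", "effector_4"},
}
core, accessory = split_core_and_accessory(example)
print(sorted(core))
print(sorted(accessory))
```

In this toy input, only the family found in all three genomes is reported as core; every other family falls into the accessory set, which is where lineage-specific or recently emerged effectors would appear.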
Effectidor II is available at https://effectidor.tau.ac.il and the source code is available at https://github.com/naamawagner/Effectidor. A standalone version of Effectidor II is available at https://github.com/naamawagner/Effectidor/tree/StandAlone. The source code for the standalone version and the data used in this work are also available at https://doi.org/10.5281/zenodo.15081636.