Department of Microbiology, University of Hong Kong, Hong Kong, China.
Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang, China.
BMC Biol. 2024 Nov 14;22(1):259. doi: 10.1186/s12915-024-02064-z.
The type IV secretion system is found in many bacteria, including Salmonella, Escherichia coli, and Helicobacter pylori. These bacteria use it to secrete type IV secretion effectors (T4SEs), which help them infect host cells and disrupt or modulate host cell communication pathways. In this study, type III and type VI secretion effectors were used as negative samples to train a robust model.
The areas under the curve of T4Seeker on the validation and independent test sets were 0.947 and 0.970, respectively, demonstrating its strong predictive capacity and robustness. Compared with classic and state-of-the-art T4SE identification models, T4Seeker, which combines traditional features with large language model features, showed higher predictive ability.
T4Seeker, the model proposed in this study, demonstrates superior performance in T4SE prediction. By integrating features at multiple levels, it achieves higher predictive accuracy and strong generalization capability, providing an effective tool for future T4SE research.
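For readers less familiar with the metric, the area under the ROC curve reported above can be read as the probability that a randomly chosen positive sample (a T4SE) receives a higher score than a randomly chosen negative sample (here, a type III or type VI effector). The following is a minimal sketch of that definition only, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Pairwise (rank-based) AUC: the fraction of positive/negative pairs
    in which the positive is scored higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = T4SE, 0 = non-T4SE (e.g. a T3SE or T6SE).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The quadratic pairwise loop is chosen for readability; for datasets of realistic size a sort-based computation gives the same value more efficiently.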