Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India.
Protein Sci. 2023 Dec;32(12):e4808. doi: 10.1002/pro.4808.
Virulence proteins are essential for a pathogen to cause disease in a host. They enable the pathogen to invade, survive, and multiply within the host while evading host defense mechanisms, thus enhancing its potential to cause disease. Identifying these factors, especially potential vaccine candidates or drug targets, is critical for vaccine and drug development research. In this context, we present VirulentPred 2.0, an improved version of VirulentPred 1.0 for rapidly identifying virulent proteins. VirulentPred 2.0 is based on machine learning models trained on experimentally validated virulent protein sequences. It achieved 84.71% accuracy on the validation dataset and 85.18% on an independent test dataset. The models are trained and evaluated with the latest sequence datasets of virulent proteins, which contain three times as many proteins as were used in the earlier version of VirulentPred. Moreover, on the latest test dataset, the best position-specific scoring matrix (PSSM)-based model achieves a significant improvement of 11% in prediction accuracy over the earlier version. VirulentPred 2.0 is available as a user-friendly web interface at https://bioinfo.icgeb.res.in/virulent2/ and as a standalone application suitable for bulk predictions. With its higher accuracy and its availability as a standalone tool, VirulentPred 2.0 holds considerable potential for high-throughput identification of virulent proteins in bacterial pathogens.
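The abstract does not describe how PSSM features are encoded or how accuracy is computed, so the following is only an illustrative sketch under stated assumptions, not the VirulentPred 2.0 implementation. It shows one commonly used encoding, a PSSM composition vector (20 x 20 = 400 values), together with the usual definition of accuracy as the fraction of correct predictions. The names `pssm_composition` and `accuracy`, the column order in `AMINO_ACIDS`, and the L x 20 input layout are all hypothetical choices made for this example.

```python
# Illustrative sketch only: NOT the VirulentPred 2.0 code.
# Assumption: the PSSM is a list of L rows with 20 scores each,
# with columns ordered as in AMINO_ACIDS (a hypothetical convention).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_composition(sequence, pssm):
    """Collapse an L x 20 PSSM into a fixed-length 400-value vector.

    For each of the 20 residue types, the PSSM rows at positions holding
    that residue are summed, and the sums are divided by the sequence length.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")

    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    sums = [[0.0] * 20 for _ in range(20)]

    for residue, row in zip(sequence, pssm):
        if len(row) != 20:
            raise ValueError("each PSSM row must contain 20 scores")
        i = index.get(residue)
        if i is None:
            continue  # non-standard residue: contributes nothing
        for j, score in enumerate(row):
            sums[i][j] += score

    length = len(sequence)
    # Flatten row by row so the vector length does not depend on L.
    return [value / length for row in sums for value in row]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal in length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)
```

A fixed-length vector of this kind lets proteins of different lengths be passed to the same classifier, which is the usual motivation for composition-style encodings. In practice the PSSM itself would come from a profile search tool such as PSI-BLAST; that step is outside this sketch.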