Atta-ur-Rahman School of Applied Biosciences (ASAB), National University of Sciences and Technology (NUST), Islamabad, Pakistan.
School of Interdisciplinary Engineering & Science (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan.
Vaccine. 2024 Sep 17;42(22):126204. doi: 10.1016/j.vaccine.2024.126204. Epub 2024 Aug 9.
The ESKAPE family, comprising Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter spp., poses a significant global threat due to its heightened virulence and extensive antibiotic resistance. These pathogens contribute substantially to the prevalence of nosocomial (hospital-acquired) infections, resulting in high morbidity and mortality rates. Tackling this healthcare problem requires urgent measures, including the development of innovative vaccines and therapeutic strategies. Designing vaccines involves a complex and resource-intensive process of identifying protective antigens and potential vaccine candidates (PVCs) from pathogens. Reverse vaccinology (RV), a genomics-based approach, has made this process more efficient by leveraging bioinformatics tools to identify potential vaccine candidates. In recent years, artificial intelligence and machine learning (ML) techniques have shown promise in enhancing the accuracy and efficiency of reverse vaccinology. This study introduces a supervised ML classification framework to predict potential vaccine candidates specifically against ESKAPE pathogens. The model was trained on biological and physicochemical properties from a dataset containing protective antigens and non-protective proteins of ESKAPE pathogens. A conventional autoencoder-based strategy was employed for feature encoding and selection. During training, seven machine learning algorithms were trained and evaluated with stratified 5-fold cross-validation. Random Forest and Logistic Regression exhibited the best performance across various metrics, including accuracy, precision, recall, WF1 score, and area under the curve (AUC). An ensemble model was developed to combine the strengths of both algorithms. To assess the efficacy of the final ensemble model, a high-quality benchmark dataset was employed.
VacSol-ML demonstrated outstanding discrimination between protective vaccine candidates and non-protective antigens. VacSol-ML proves to be an invaluable tool in expediting vaccine development for these pathogens. It is accessible to the public through both a web server and a standalone version, which encourages collaborative research. The web-based and standalone tools are available at http://vacsolml.mgbio.tech/.
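The evaluation scheme described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the VacSol-ML implementation: the abstract does not say how the Random Forest and Logistic Regression outputs are combined, so the soft-voting rule (averaging two probability scores), the 0.5 threshold, the function names, and the toy labels below are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class's indices round-robin so every fold gets its share.
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def soft_vote(prob_a, prob_b, threshold=0.5):
    """Average two models' positive-class probabilities, then threshold the mean.

    Hypothetical combination rule; the source does not specify one.
    """
    avg = [(a + b) / 2 for a, b in zip(prob_a, prob_b)]
    return [1 if p >= threshold else 0 for p in avg]


# Toy labels (illustrative only): 10 protective (1) and 40 non-protective (0) proteins.
labels = [1] * 10 + [0] * 40
splits = list(stratified_kfold(labels, k=5))

# Made-up probability scores standing in for the two base models.
votes = soft_vote([0.9, 0.2, 0.6], [0.7, 0.4, 0.3])
```

Stratification matters here because protective antigens are typically far rarer than non-protective proteins; without it, some folds could contain almost no positive examples and the per-fold metrics would be unstable.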