Amity Institute of Biotechnology, Amity University, Uttar Pradesh, India.
Texas Children's Center for Vaccine Development, Departments of Pediatrics and Molecular Virology and Microbiology, National School of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA.
Comput Biol Med. 2022 Jun;145:105401. doi: 10.1016/j.compbiomed.2022.105401. Epub 2022 Mar 22.
Developing a new vaccine is a challenging process that involves several steps: computational studies, experimental work, and animal studies, followed by clinical studies. To accelerate this process, in silico screening is frequently used for antigen identification. Here, we present Vaxi-DL, a web-based deep learning (DL) tool that evaluates the potential of protein sequences to serve as vaccine target antigens. Four separate DL pathogen models were trained to predict target antigens in bacteria, protozoa, fungi, and viruses that cause infectious diseases in humans. Datasets of antigenic and non-antigenic sequences were derived from known vaccine candidates and the Protegen database. Biological and physicochemical properties were computed for the datasets using publicly available bioinformatics tools. For each of the four pathogen models, the dataset was divided into training, validation, and testing subsets and then scaled and normalised. The models were constructed using Fully Connected Layers (FCLs), subjected to hyperparameter tuning, and trained on the training subset. Accuracy, sensitivity, specificity, precision, recall, and AUC (Area Under the Curve) were used as metrics to assess model performance. The models were benchmarked against other prediction tools, such as VaxiJen and Vaxign-ML, using independent datasets of known target antigens. We also tested Vaxi-DL on 219 known potential vaccine candidates (PVCs) from 37 different pathogens; the tool correctly predicted 175 of the 219 sequences. We further tested Vaxi-DL on different datasets obtained from multiple resources. The tool demonstrated an average sensitivity of 93% and should therefore be useful for prioritising PVCs for preclinical studies.
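The threshold-based metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the Vaxi-DL implementation: the function names (`confusion_counts`, `binary_metrics`) and the toy labels are hypothetical and chosen only for illustration. The only figures taken from the abstract are the 175 correct predictions out of 219 PVCs.

```python
# Illustrative sketch only; not the authors' code.
# Labels: 1 = antigenic (positive), 0 = non-antigenic (negative).

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity (recall), specificity, and precision."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_div(tp, tp + fn),  # identical to recall
        "specificity": safe_div(tn, tn + fp),
        "precision": safe_div(tp, tp + fp),
    }


# Hypothetical toy labels, not data from the study.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)

# Fraction of known PVCs predicted correctly, using the counts in the abstract.
pvc_correct, pvc_total = 175, 219
pvc_fraction = pvc_correct / pvc_total

if __name__ == "__main__":
    print(toy_metrics)
    print(f"PVCs predicted correctly: {pvc_fraction:.1%}")
```

Because all 219 PVC sequences are known positives, the 175/219 fraction (about 79.9%) is a sensitivity on that particular set. It is distinct from the 93% average sensitivity, which the abstract reports separately.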