Angaitkar Pratik, Janghel Rekh Ram, Sahu Tirath Prasad
Department of Information Technology, National Institute of Technology, Raipur, G.E. Road, Raipur, C.G. 492010 India.
3 Biotech. 2023 Sep;13(9):297. doi: 10.1007/s13205-023-03716-7. Epub 2023 Aug 9.
Prediction of conformational B-cell epitopes (CBCE) is an essential step in vaccine design, drug development, and accurate disease diagnosis. Many laboratory and computational approaches have been developed to predict CBCE. However, laboratory experiments are costly and time-consuming, which has made machine learning (ML)-based computational methods popular. Although ML methods have succeeded in many domains, achieving high accuracy in CBCE prediction remains a challenge. To address this limitation of ML methods, this paper proposes a novel deep learning (DL)-based framework for CBCE prediction, leveraging the capabilities of deep learning in the medical domain. The proposed model, named Deep Learning-based Temporal Convolutional Neural Network (DL-TCNN), hybridizes an empirically hyper-tuned 1D-CNN with a temporal convolutional network (TCN). TCN is an architecture that employs causal convolutions and dilations, which give it a large receptive field and make it well suited to sequential input. To train the proposed model, physicochemical features are first extracted from antigen sequences. Next, the Synthetic Minority Oversampling Technique (SMOTE) is applied to address the class imbalance problem. Finally, the proposed DL-TCNN is employed to predict CBCE. The model's performance is evaluated and validated on a benchmark antigen-antibody dataset. DL-TCNN achieves 94.44% accuracy and an AUC of 0.989 on the training dataset, 78.53% accuracy and an AUC of 0.661 on the validation dataset, and 85.10% accuracy and an AUC of 0.855 on the testing dataset. The proposed model is reported to outperform all existing CBCE prediction methods.
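The causal, dilated convolution that the abstract attributes to TCN can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no kernel sizes, dilation rates, or layer counts, so the function names, the weights, and the dilation schedule below are assumptions chosen only for illustration. The sketch assumes zero padding on the left and one convolution per dilation level.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal 1D convolution over a sequence x.

    weights[0] multiplies the current step and weights[j] multiplies the
    step j * dilation positions in the past, so out[t] never depends on
    inputs later than t. Positions before the start of the sequence are
    treated as zeros (left padding).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:  # skip taps that fall before the sequence start
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal convolutions, one per dilation."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical usage: each output is x[t] + x[t - 2].
print(causal_dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2))
# Doubling the dilation per layer grows the receptive field exponentially.
print(receptive_field(kernel_size=2, dilations=[1, 2, 4, 8]))
```

With a kernel size of 2 and dilations 1, 2, 4, 8, four layers already cover 16 sequence positions, which is the sense in which dilations provide an extensive receptive field at low depth.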
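The SMOTE step named in the abstract addresses class imbalance by creating synthetic minority-class samples through interpolation between a minority sample and one of its nearest minority-class neighbours. The following is a minimal sketch of that idea only. The function name `smote_sketch`, the neighbour count `k`, and the toy feature vectors are hypothetical, and the sketch assumes at least two minority samples; the abstract does not state the settings the authors used.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority feature vectors.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Assumes len(minority) >= 2.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical usage with four 2-D minority samples.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for point in smote_sketch(minority, n_new=3):
    print(point)
```

In practice, a maintained implementation such as the `SMOTE` class in the imbalanced-learn package (`fit_resample`) would normally be used instead of hand-written code. Oversampling is generally applied to the training split only, so that validation and test data keep their original class distribution; the abstract does not say how the authors handled this.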