APEX-pHLA: A novel method for accurate prediction of the binding between exogenous short peptides and HLA class I molecules.

Affiliation

College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, Zhejiang 310014, China.

Publication information

Methods. 2024 Aug;228:38-47. doi: 10.1016/j.ymeth.2024.05.013. Epub 2024 May 19.

DOI:10.1016/j.ymeth.2024.05.013

Abstract

Human leukocyte antigen (HLA) molecules play a critical role in immunotherapy because they recognize and bind exogenous antigens such as peptides and then present them to immune cells. Predicting the binding between peptides and HLA molecules (pHLA) can expedite the screening of immunogenic peptides and facilitate vaccine design. However, traditional experimental methods are time-consuming and inefficient. In this study, an efficient deep learning method was developed for predicting peptide-HLA binding, which treats peptide sequences as linguistic entities. It combines the textCNN and BiLSTM architectures into a deep neural network model called APEX-pHLA. The model is not restricted to particular HLA class I allele variants or peptide lengths, and it efficiently encodes sequence features for both the HLA molecule and the peptide. On the independent test set, the model achieved Accuracy, ROC_AUC, F1, and MCC of 0.9449, 0.9850, 0.9453, and 0.8899, respectively. On an external test set, the corresponding results were 0.9803, 0.9574, 0.8835, and 0.7863. These results outperformed fifteen methods previously reported in the literature. The accurate peptide-HLA binding predictions of the APEX-pHLA model might provide valuable insights for future HLA vaccine design.
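
For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how Accuracy, F1, MCC, and ROC_AUC are conventionally computed for a binary binder/non-binder task. It is not the authors' evaluation code: the function names `binary_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, F1, and MCC for binary labels (1 = binder, 0 = non-binder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This pairwise form is O(n^2) and is
    meant for illustration on small inputs only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the paper): predicted binding scores in [0, 1].
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(labels, preds)
auc = roc_auc(labels, scores)
```

Accuracy, F1, and MCC depend on the chosen decision threshold, whereas ROC_AUC is computed from the raw scores and is threshold-independent.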
