Institute of Science and Technology for Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China.
Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto Prefecture 611-0011, Japan.
Bioinformatics. 2023 Sep 2;39(9). doi: 10.1093/bioinformatics/btad551.
Computationally predicting the binding affinity between peptides and major histocompatibility complex class I (MHC-I) molecules is an important problem in immunological bioinformatics and is crucial for identifying neoantigens for personalized therapeutic cancer vaccines. Recent deep learning-based methods for this problem do not achieve satisfactory performance, especially for non-9-mer peptides. The reason is that such methods generate their input by simply concatenating the two given sequences, a peptide and the pseudo sequence of an MHC class I molecule, which cannot precisely capture the anchor positions of the MHC binding motif for peptides of variable length. We therefore developed DeepMHCI, an anchor position-aware, high-performance deep model with a position-wise gated layer and a residual binding interaction convolution layer. These layers allow the model to control the information flow in peptides so that it is aware of anchor positions, and to model the interactions between peptides and the MHC pseudo (binding) sequence directly with multiple convolutional kernels.
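To make the two ideas named above more concrete, the following is a minimal NumPy sketch of a position-wise gate followed by a peptide-MHC interaction convolution. It is not the DeepMHCI implementation: the class name `ToyAnchorAwareModel`, the embedding size, the kernel width, the maximum peptide length, the bilinear form of the kernel, the max-pooling over windows, and the toy pseudo sequence are all assumptions made for illustration. Weights are random and the residual connection is omitted for brevity.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}  # index 0 is padding
MAX_PEPTIDE_LEN = 15  # assumed upper bound on peptide length


def encode(seq, length):
    """Map a sequence to integer indices, right-padded with 0 up to `length`."""
    idx = [AA_INDEX[aa] for aa in seq]
    return np.array(idx + [0] * (length - len(idx)))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ToyAnchorAwareModel:
    """Hypothetical, untrained model: position-wise gate + interaction convolution."""

    def __init__(self, emb_dim=8, kernel_width=3, seed=0):
        rng = np.random.default_rng(seed)
        self.emb = rng.normal(scale=0.1, size=(len(AMINO_ACIDS) + 1, emb_dim))
        self.emb[0] = 0.0  # padding embeds to the zero vector
        # One gate weight vector and bias per peptide position, so the gate
        # can treat, for example, position 2 differently from position 5.
        self.gate_w = rng.normal(scale=0.1, size=(MAX_PEPTIDE_LEN, emb_dim))
        self.gate_b = np.zeros(MAX_PEPTIDE_LEN)
        # Bilinear kernel: (window offset, peptide dim, MHC dim). Assumed form.
        self.kernel = rng.normal(scale=0.1, size=(kernel_width, emb_dim, emb_dim))
        self.kernel_width = kernel_width

    def gate(self, pep_emb):
        """Scale each position's embedding by a position-specific gate in (0, 1)."""
        logits = np.sum(pep_emb * self.gate_w, axis=1) + self.gate_b
        return pep_emb * sigmoid(logits)[:, None]

    def interact(self, pep_emb, mhc_emb, pep_len):
        """Score every peptide window against every MHC pseudo-sequence position."""
        k = self.kernel_width
        scores = []
        for start in range(pep_len - k + 1):
            window = pep_emb[start:start + k]  # shape (k, emb_dim)
            scores.append(np.einsum("oa,oab,mb->m", window, self.kernel, mhc_emb))
        # Max over windows gives one feature per MHC position.
        return np.stack(scores).max(axis=0)

    def predict(self, peptide, mhc_pseudo):
        pep_emb = self.emb[encode(peptide, MAX_PEPTIDE_LEN)]
        mhc_emb = self.emb[encode(mhc_pseudo, len(mhc_pseudo))]
        features = self.interact(self.gate(pep_emb), mhc_emb, len(peptide))
        return float(sigmoid(features.mean()))  # toy score in (0, 1)


if __name__ == "__main__":
    model = ToyAnchorAwareModel()
    toy_mhc = "YFAMYGEKVAHTHV"  # made-up string, not a real pseudo sequence
    for peptide in ("SLYNTVATL", "SLYNTVATLAA"):  # a 9-mer and an 11-mer
        print(peptide, model.predict(peptide, toy_mhc))
```

Because the gate parameters are indexed by position rather than shared across the peptide, a trained version of such a gate could learn to emphasize anchor positions; and because the peptide and MHC embeddings interact inside the kernel rather than being concatenated, peptides of different lengths are handled by the same windows.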
The performance of DeepMHCI was validated by extensive experiments on four benchmark datasets under various settings, such as 5-fold cross-validation, validation on an independent test set, external HPV vaccine identification, and external CD8+ epitope identification. Experimental results, together with visualization of binding motifs, demonstrate that DeepMHCI outperformed all competing methods, especially in binding prediction for non-9-mer peptides.
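As a reminder of what the 5-fold cross-validation setting involves, here is a generic sketch of a k-fold index split in plain Python. The helper `k_fold_indices` is hypothetical and uses a simple shuffled partition; the splits used in the DeepMHCI experiments may be defined differently (for example, predefined benchmark folds).

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fold, (train, test) in enumerate(k_fold_indices(23)):
        print(f"fold {fold}: {len(train)} train, {len(test)} test")
```

Each sample appears in exactly one test fold, so every peptide-MHC pair is scored once by a model that did not see it during training.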
DeepMHCI is publicly available at https://github.com/ZhuLab-Fudan/DeepMHCI.