Niu Rui, Wang Jingwei, Li Yanli, Zhou Jiren, Guo Yang, Shang Xuequn
School of Computer Science, Northwestern Polytechnical University, Xi'an, 710129 Shaanxi, China.
John Curtin School of Medical Research, The Australian National University, Canberra, ACT 2600, Australia.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf038.
The identification of neoantigens is crucial for advancing vaccines, diagnostics, and immunotherapies. Despite this importance, a fundamental question remains: how to model the presentation of neoantigens by major histocompatibility complex class I (MHC-I) molecules and the recognition of the peptide-MHC-I (pMHC-I) complex by T cell receptors (TCRs). Accurate prediction of pMHC-I binding and TCR recognition remains a significant computational challenge in immunology because of intricate binding motifs and the long-tail distribution of known binding pairs in public databases. Here, we propose an attention-aware framework comprising TranspMHC for pMHC-I binding prediction and TransTCR for TCR-pMHC-I recognition prediction. Leveraging the attention mechanism, TranspMHC surpasses existing algorithms on independent datasets at both pan-specific and allele-specific levels. For TCR-pMHC-I recognition, TransTCR incorporates transfer learning and a differential learning strategy, demonstrating superior performance and enhanced generalization on independent datasets compared to existing methods. Furthermore, we identify key amino acids associated with the binding motifs of peptides and TCRs that facilitate pMHC-I and TCR-pMHC-I binding, indicating the potential interpretability of our proposed framework.
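To make concrete how an attention mechanism can point to individual amino acids, the following is a minimal sketch of scaled dot-product self-attention over a one-hot-encoded peptide. It is not the TranspMHC or TransTCR architecture, which the abstract does not specify: the identity projections (queries and keys equal to the raw encoding), the absence of training, the function names, and the example 9-mer are all assumptions made for illustration only.

```python
import math

# The 20 standard amino acids, used as the one-hot alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide):
    """Encode each residue as a 20-dimensional one-hot vector."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in peptide]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_weights(vectors):
    """Scaled dot-product attention weights with identity projections.

    Assumption: Q = K = the input vectors. A trained model would apply
    learned projection matrices before taking the dot products.
    """
    dim = len(vectors[0])
    weights = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights.append(softmax(scores))
    return weights


# Illustrative 9-mer peptide (hypothetical input, not taken from the paper).
peptide = "GILGFVFTL"
weights = self_attention_weights(one_hot(peptide))

# Each row shows how strongly one position attends to every other position.
for residue, row in zip(peptide, weights):
    print(residue, " ".join(f"{w:.3f}" for w in row))
```

In this untrained toy, a position attends most strongly to positions carrying the same residue. In a trained model, rows of such a weight matrix are the kind of quantity that can be inspected to ask which residues contribute most to a predicted binding event.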