IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA.
Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029.
Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad504.
Immunologic recognition of peptide antigens bound to class I major histocompatibility complex (MHC) molecules is essential to both novel immunotherapeutic development and human health at large. Current methods for predicting antigen peptide immunogenicity rely primarily on simple sequence representations, which allow for some understanding of immunogenic features but provide inadequate consideration of the full scale of molecular mechanisms tied to peptide recognition. We here characterize contributions that unsupervised and supervised artificial intelligence (AI) methods can make toward understanding and predicting MHC(HLA-A2)-peptide complex immunogenicity when applied to large ensembles of molecular dynamics simulations. We first show that an unsupervised AI method allows us to identify subtle features that drive immunogenicity differences between a cancer neoantigen and its wild-type peptide counterpart. Next, we demonstrate that a supervised AI method for class I MHC(HLA-A2)-peptide complex classification significantly outperforms a sequence model on small datasets corrected for trivial sequence correlations. Furthermore, we show that both unsupervised and supervised approaches reveal determinants of immunogenicity based on time-dependent molecular fluctuations and anchor position dynamics outside the MHC binding groove. We discuss implications of these structural and dynamic immunogenicity correlates for the induction of T cell responses and therapeutic T cell receptor design.