Krishnaswamy Smita, Givechian Kevin, Rocha João, Yang Edward, Liu Chen, Greene Kerrie, Ying Rex, Caron Etienne, Iwasaki Akiko
Res Sq. 2025 May 21:rs.3.rs-6606336. doi: 10.21203/rs.3.rs-6606336/v1.
Epitope-based vaccines are promising therapeutic modalities for infectious diseases and cancer, but identifying immunogenic epitopes is challenging. The vast majority of prediction methods use only amino acid sequence information and do not incorporate large-scale structural data or the biochemical properties of each peptide-MHC. We present ImmunoStruct, a deep-learning model that integrates sequence, structural, and biochemical information to predict multi-allele class-I peptide-MHC immunogenicity. By leveraging a multimodal dataset of ∼27,000 peptide-MHCs, we demonstrate that ImmunoStruct improves immunogenicity prediction performance and interpretability beyond existing methods, across infectious disease epitopes and cancer neoepitopes. We further show strong alignment with assay results for a set of SARS-CoV-2 epitopes, as well as strong performance in peptide-MHC-based cancer patient survival prediction. Overall, this work also presents a new architecture that incorporates equivariant graph processing and multimodal data integration for this long-standing task in immunotherapy.