Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, China.
Genmab US, Inc., Princeton, New Jersey 08540, United States.
J Chem Inf Model. 2023 Dec 11;63(23):7557-7567. doi: 10.1021/acs.jcim.3c01293. Epub 2023 Nov 22.
Identifying the interactions between T-cell receptors (TCRs) and human antigens is a crucial step in developing new vaccines, diagnostics, and immunotherapies. Current methods primarily learn binding patterns from known TCR binding repertoires using sequence information alone, without considering the binding specificity of new antigens or exogenous peptides that do not appear in the training set. Furthermore, the spatial structure of antigens plays a critical role in immune studies and immunotherapy and should be properly accounted for when identifying interacting TCR-antigen pairs. In this study, we introduce GGNpTCR, a novel deep learning framework based on generative graph structures, for predicting interactions between TCRs and peptides from sequence information. Results of real data analysis indicate that our model achieved excellent prediction for new antigens unseen in the training data set, significantly improving on existing methods. We also applied the model to a large COVID-19 data set, none of whose antigens appear in the training data set, and the improvement was again significant. In addition, by incorporating additional supervised mechanisms, GGNpTCR was able to precisely predict the locations of peptide-TCR interactions within 3D configurations, which substantially improved the model's interpretability. In summary, based on results across multiple data sets, GGNpTCR makes significant progress in performance, universality, and interpretability.