PepTCR-Net: prediction of multi-class antigen peptides by T-cell receptor sequences with deep learning.

Author information

Le Phi, Ung Leah, Yang Hai, Huang Anwen, He Tao, Bruno Peter, Oh David Y, Keenan Bridget P, Zhang Li

Affiliations

Department of Medicine, University of California San Francisco, 550 16th Street, San Francisco, CA 94158, United States.

Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, 1450 3rd St. San Francisco, CA 94158, United States.

Publication information

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf351.

DOI: 10.1093/bib/bbaf351

PMID: 40702702

Abstract

Predicting which antigen peptides a T-cell receptor (TCR) recognizes is crucial for understanding the immune system and developing new treatments for cancer, infectious diseases, and autoimmune diseases. Because experimental methods for identifying TCR-antigen recognition are expensive and time-consuming, machine-learning approaches are increasingly used. However, existing computational tools often generalize poorly because data are limited and true non-recognition pairs are hard to obtain, and they rarely integrate multiple biological features into a unified framework. To address these challenges, we propose a two-step framework for predicting TCR-antigen recognition. The first step focuses on feature engineering: neural network-based embeddings of the letter-based TCR and peptide sequences, inspired by language models, and categorical encoding of Human Leukocyte Antigen (HLA) types and Variable/Joining (V/J) genes. In the second step, we built a Bayesian feedforward neural network that estimates the likelihood of TCR-antigen recognition. We trained and validated the framework using large public databases. Our results demonstrate that this feature engineering delivers strong predictive performance in both internal and external validation. We applied the framework to a real-world case, predicting whether specific TCRs can recognize SARS-CoV-2 epitope peptides, demonstrating that the framework can function as a de novo TCR-antigen prediction tool applicable to infectious diseases.
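The two-step idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the architecture, so everything here is an assumption made for illustration, including the function names, the mean-pooled lookup-table embedding (a stand-in for the language-model-inspired embeddings), the one-hot encoding, the single hidden layer, and the Gaussian weight sampling used to mimic a Bayesian network's predictive distribution. The parameters are random and untrained, and the input values are placeholders rather than data from the paper.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def embed_sequence(seq, table):
    """Mean-pool per-residue vectors into one fixed-length vector.

    Assumption: a plain lookup table stands in for a learned embedding.
    Letters outside the 20 standard residues raise KeyError.
    """
    dim = len(table[0])
    pooled = [0.0] * dim
    for aa in seq:
        vec = table[AA_INDEX[aa]]
        for k in range(dim):
            pooled[k] += vec[k] / len(seq)
    return pooled


def one_hot(value, vocabulary):
    """Categorical encoding for an HLA type or a V/J gene label."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def build_features(tcr_seq, peptide, hla, v_gene, j_gene, table, vocab):
    """Step 1: concatenate sequence embeddings and categorical encodings."""
    return (embed_sequence(tcr_seq, table) + embed_sequence(peptide, table)
            + one_hot(hla, vocab["hla"]) + one_hot(v_gene, vocab["v"])
            + one_hot(j_gene, vocab["j"]))


def sample_layer(mean, std, rng):
    """Draw one weight matrix, treating each weight as an independent Gaussian."""
    return [[rng.gauss(m, std) for m in row] for row in mean]


def forward(x, w_hidden, w_out):
    """One tanh hidden layer followed by a single sigmoid output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    logit = sum(w * h for w, h in zip(w_out[0], hidden))
    return 1.0 / (1.0 + math.exp(-logit))


def predict(x, mean_hidden, mean_out, std=0.1, n_samples=200, seed=0):
    """Step 2: Monte Carlo estimate of the predictive mean and spread."""
    rng = random.Random(seed)
    probs = [forward(x, sample_layer(mean_hidden, std, rng),
                     sample_layer(mean_out, std, rng))
             for _ in range(n_samples)]
    mean = sum(probs) / n_samples
    var = sum((p - mean) ** 2 for p in probs) / n_samples
    return mean, math.sqrt(var)


# Placeholder inputs and random, untrained parameters (all hypothetical).
init = random.Random(42)
EMBED_DIM = 4
table = [[init.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in AMINO_ACIDS]
vocab = {"hla": ["HLA-A*02:01", "HLA-B*07:02"],
         "v": ["TRBV19", "TRBV7-9"],
         "j": ["TRBJ2-7", "TRBJ1-1"]}
features = build_features("CASSIRSSYEQYF", "GILGFVFTL",
                          "HLA-A*02:01", "TRBV19", "TRBJ2-7", table, vocab)
mean_hidden = [[init.uniform(-1, 1) for _ in features] for _ in range(3)]
mean_out = [[init.uniform(-1, 1) for _ in range(3)]]
prob_mean, prob_sd = predict(features, mean_hidden, mean_out)
print(f"recognition probability: {prob_mean:.3f} +/- {prob_sd:.3f}")
```

Sampling the weights repeatedly, rather than using one fixed set, is what gives a spread alongside the mean prediction; in a trained model that spread would serve as an uncertainty estimate for each TCR-peptide pair.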

Similar articles

TCR-epiDiff: solving dual challenges of TCR generation and binding prediction.

Bioinformatics. 2025 Jul 1;41(Supplement_1):i125-i132. doi: 10.1093/bioinformatics/btaf202.

Iterative attack-and-defend framework for improving TCR-epitope binding prediction models.

Bioinformatics. 2025 Jul 1;41(Supplement_1):i429-i438. doi: 10.1093/bioinformatics/btaf224.

Short-Term Memory Impairment.

iACP-DPNet: a dual-pooling causal dilated convolutional network for interpretable anticancer peptide identification.

Funct Integr Genomics. 2025 Jul 4;25(1):147. doi: 10.1007/s10142-025-01641-x.

Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.

Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.

G2VTCR: predicting antigen binding specificity by Weisfeiler-Lehman graph embedding of T cell receptor sequences.

bioRxiv. 2025 May 4:2025.04.29.651344. doi: 10.1101/2025.04.29.651344.

Advancing the Accuracy of Anti-MRSA Peptide Prediction Through Integrating Multi-Source Protein Language Models.

Interdiscip Sci. 2025 Mar 11. doi: 10.1007/s12539-025-00696-5.

Artificial intelligence for diagnosing exudative age-related macular degeneration.

Cochrane Database Syst Rev. 2024 Oct 17;10(10):CD015522. doi: 10.1002/14651858.CD015522.pub2.

In vitro machine learning-based CAR T immunological synapse quality measurements correlate with patient clinical outcomes.

PLoS Comput Biol. 2022 Mar 18;18(3):e1009883. doi: 10.1371/journal.pcbi.1009883. eCollection 2022 Mar.
