TCR-H: explainable machine learning prediction of T-cell receptor epitope binding on unseen datasets.

Affiliations

UT/ORNL Center for Molecular Biophysics, Oak Ridge National Laboratory, Oak Ridge, TN, United States.

Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN, United States.

Publication information

Front Immunol. 2024 Aug 16;15:1426173. doi: 10.3389/fimmu.2024.1426173. eCollection 2024.

DOI:10.3389/fimmu.2024.1426173
PMID:39221256
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11361934/
Abstract

Artificial-intelligence and machine-learning (AI/ML) approaches to predicting T-cell receptor (TCR)-epitope specificity achieve high performance metrics on test datasets that include sequences also present in the training set, but fail to generalize to test sets consisting of epitopes and TCRs that are absent from the training set, i.e., are 'unseen' during training of the ML model. We present TCR-H, a supervised support vector machine (SVM) classification model that uses physicochemical features and is trained on the largest dataset available to date, using only experimentally validated non-binders as negative datapoints. TCR-H exhibits an area under the receiver operating characteristic curve (ROC AUC) of 0.87 for epitope 'hard splitting' (i.e., on test sets with all epitopes unseen during ML training), 0.92 for TCR hard splitting, and 0.89 for 'strict splitting', in which neither the epitopes nor the TCRs in the test set are seen in the training data. Furthermore, we employ the SHAP (Shapley additive explanations) eXplainable AI (XAI) method to interpret the models trained with different hard splits, shedding light on the key physicochemical features driving model predictions. TCR-H thus represents a significant step towards general applicability and explainability of epitope:TCR specificity prediction.
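
The difference between 'hard splitting' and 'strict splitting' can be illustrated with a minimal Python sketch. It is not the authors' code: the record layout (dicts with "tcr", "epitope" and "label" keys), the function names hard_split and strict_split, the test_fraction parameter and the toy identifiers are all assumptions made for illustration. In this sketch, a strict split discards records that pair a held-out TCR with a training epitope (or the reverse), which is one plausible way to guarantee that neither side of a test pair was seen in training.

```python
import random


def hard_split(records, field, test_fraction=0.2, seed=0):
    """Hold out whole values of `field` ("tcr" or "epitope").

    No value of `field` appears in both the training and the test set.
    """
    values = sorted({rec[field] for rec in records})
    random.Random(seed).shuffle(values)
    n_held_out = max(1, round(len(values) * test_fraction))
    held_out = set(values[:n_held_out])
    train = [rec for rec in records if rec[field] not in held_out]
    test = [rec for rec in records if rec[field] in held_out]
    return train, test


def strict_split(records, test_fraction=0.2, seed=0):
    """Hold out a set of TCRs and a set of epitopes at the same time.

    Test records have both a held-out TCR and a held-out epitope.
    Training records have neither. Mixed records are dropped (assumption).
    """
    rng = random.Random(seed)
    held_out = {}
    for field in ("tcr", "epitope"):
        values = sorted({rec[field] for rec in records})
        rng.shuffle(values)
        n_held_out = max(1, round(len(values) * test_fraction))
        held_out[field] = set(values[:n_held_out])
    train = [
        rec for rec in records
        if rec["tcr"] not in held_out["tcr"]
        and rec["epitope"] not in held_out["epitope"]
    ]
    test = [
        rec for rec in records
        if rec["tcr"] in held_out["tcr"]
        and rec["epitope"] in held_out["epitope"]
    ]
    return train, test


# Toy data: every pairing of 5 hypothetical TCRs with 5 hypothetical epitopes.
# The labels are arbitrary placeholders, not experimental results.
records = [
    {"tcr": f"TCR{i}", "epitope": f"EPI{j}", "label": (i + j) % 2}
    for i in range(5)
    for j in range(5)
]

epi_train, epi_test = hard_split(records, "epitope", test_fraction=0.4)
print(len(epi_train), len(epi_test))  # 15 10

strict_train, strict_test = strict_split(records, test_fraction=0.4)
print(len(strict_train), len(strict_test))  # 9 4
```

With 5 TCRs, 5 epitopes and test_fraction=0.4, the strict split keeps only 13 of the 25 toy records, which shows why strict evaluation needs a large starting dataset.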

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3817/11361934/6bb044c288c9/fimmu-15-1426173-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3817/11361934/3b89c37c5028/fimmu-15-1426173-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3817/11361934/55a5fdbf729a/fimmu-15-1426173-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3817/11361934/bae2db9e22e8/fimmu-15-1426173-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3817/11361934/edb042e5b021/fimmu-15-1426173-g005.jpg

Similar articles

1
TCR-H: explainable machine learning prediction of T-cell receptor epitope binding on unseen datasets.
Front Immunol. 2024 Aug 16;15:1426173. doi: 10.3389/fimmu.2024.1426173. eCollection 2024.
2
Predicting TCR sequences for unseen antigen epitopes using structural and sequence features.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae210.
3
Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa318.
4
On TCR binding predictors failing to generalize to unseen peptides.
Front Immunol. 2022 Oct 21;13:1014256. doi: 10.3389/fimmu.2022.1014256. eCollection 2022.
5
TULIP: A transformer-based unsupervised language model for interacting peptides and T cell receptors that generalizes to unseen epitopes.
Proc Natl Acad Sci U S A. 2024 Jun 11;121(24):e2316401121. doi: 10.1073/pnas.2316401121. Epub 2024 Jun 5.
6
ATM-TCR: TCR-Epitope Binding Affinity Prediction Using a Multi-Head Self-Attention Model.
Front Immunol. 2022 Jul 6;13:893247. doi: 10.3389/fimmu.2022.893247. eCollection 2022.
7
BERTrand-peptide:TCR binding prediction using Bidirectional Encoder Representations from Transformers augmented with random TCR pairing.
Bioinformatics. 2023 Aug 1;39(8). doi: 10.1093/bioinformatics/btad468.
8
TSpred: a robust prediction framework for TCR-epitope interactions using paired chain TCR sequence data.
Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae472.
9
GTE: a graph learning framework for prediction of T-cell receptors and epitopes binding specificity.
Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae343.
10
Predicting T cell receptor functionality against mutant epitopes.
Cell Genom. 2024 Sep 11;4(9):100634. doi: 10.1016/j.xgen.2024.100634. Epub 2024 Aug 15.

Cited by

1
AI-driven epitope prediction: a system review, comparative analysis, and practical guide for vaccine development.
NPJ Vaccines. 2025 Aug 30;10(1):207. doi: 10.1038/s41541-025-01258-y.
2
Artificial intelligence and machine learning in the development of vaccines and immunotherapeutics-yesterday, today, and tomorrow.
Front Artif Intell. 2025 Jul 18;8:1620572. doi: 10.3389/frai.2025.1620572. eCollection 2025.
3
The dawn of biophysical representations in computational immunology.
QRB Discov. 2025 May 28;6:e19. doi: 10.1017/qrd.2025.7. eCollection 2025.
4
T-cell receptor dynamics in digestive system cancers: a multi-layer machine learning approach for tumor diagnosis and staging.
Front Immunol. 2025 Apr 8;16:1556165. doi: 10.3389/fimmu.2025.1556165. eCollection 2025.
5
Origins of T-cell-mediated autoimmunity in acquired aplastic anaemia.
Br J Haematol. 2025 Apr;206(4):1035-1053. doi: 10.1111/bjh.19993. Epub 2025 Jan 21.
6
Signals in the Cells: Multimodal and Contextualized Machine Learning Foundations for Therapeutics.
bioRxiv. 2024 Nov 12:2024.06.12.598655. doi: 10.1101/2024.06.12.598655.

References

1
VitTCR: A deep learning method for peptide recognition prediction.
iScience. 2024 Apr 18;27(5):109770. doi: 10.1016/j.isci.2024.109770. eCollection 2024 May 17.
2
Immunology: Meta-learning for T cell-receptor binding specificity and beyond.
Nat Mach Intell. 2023 Apr;5(4):337-339. doi: 10.1038/s42256-023-00641-5. Epub 2023 Mar 31.
3
EPIC-TRACE: predicting TCR binding to unseen epitopes using attention and contextualized embeddings.
Bioinformatics. 2023 Dec 1;39(12). doi: 10.1093/bioinformatics/btad743.
4
Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel.
Sci Rep. 2023 Nov 10;13(1):19561. doi: 10.1038/s41598-023-46930-2.
5
A transfer-learning approach to predict antigen immunogenicity and T-cell receptor specificity.
Elife. 2023 Sep 8;12:e85126. doi: 10.7554/eLife.85126.
6
BERTrand-peptide:TCR binding prediction using Bidirectional Encoder Representations from Transformers augmented with random TCR pairing.
Bioinformatics. 2023 Aug 1;39(8). doi: 10.1093/bioinformatics/btad468.
7
What's the Catch? The Significance of Catch Bonds in T Cell Activation.
J Immunol. 2023 Aug 1;211(3):333-342. doi: 10.4049/jimmunol.2300141.
8
MITNet: a fusion transformer and convolutional neural network architecture approach for T-cell epitope prediction.
Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad202.
9
Antigen-specificity measurements are the key to understanding T cell responses.
Front Immunol. 2023 Apr 14;14:1127470. doi: 10.3389/fimmu.2023.1127470. eCollection 2023.
10
epiTCR: a highly sensitive predictor for TCR-peptide binding.
Bioinformatics. 2023 May 4;39(5). doi: 10.1093/bioinformatics/btad284.