A comprehensive benchmarking for evaluating TCR embeddings in modeling TCR-epitope interactions.

Authors

Feng Xikang, Huo Miaozhe, Li He, Yang Yongze, Jiang Yuepeng, He Liang, Cheng Li Shuai

Affiliations

School of Software, Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an Shaanxi, 710072, China.

Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon Tong, Hong Kong, 999077, China.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf030.

DOI: 10.1093/bib/bbaf030
PMID: 39883514
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11781202/
Abstract

The complexity of T cell receptor (TCR) sequences, particularly within the complementarity-determining region 3 (CDR3), requires efficient embedding methods for applying machine learning to immunology. While various TCR CDR3 embedding strategies have been proposed, the absence of a systematic evaluation has created confusion in the community. Here, we extracted CDR3 embedding models from 19 existing methods and benchmarked these models on four curated datasets by assessing their impact on the performance of downstream TCR tasks, including TCR-epitope binding affinity prediction, epitope-specific TCR identification, TCR clustering, and visualization analysis. We assessed these models using eight downstream classifiers and five downstream clustering methods, with performance measured by a diverse range of metrics for precision, robustness, and usability. Overall, handcrafted embeddings outperformed data-driven ones in modeling TCR-epitope interactions. To make these comparative findings easier to apply, we developed an all-in-one TCR CDR3 embedding package comprising all evaluated embedding models. This package will help users select suitable embedding models for their data.
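
As a rough illustration of what a handcrafted CDR3 embedding can look like, the sketch below maps a CDR3 amino-acid string to a fixed-length vector of residue frequencies plus mean Kyte-Doolittle hydropathy, and compares two vectors by cosine similarity. It is not one of the 19 benchmarked models and does not use the authors' package. The function names `embed_cdr3` and `cosine_similarity`, the choice of features, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (one common handcrafted physicochemical feature).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def embed_cdr3(seq):
    """Hypothetical handcrafted embedding: 20 residue frequencies + mean hydropathy."""
    seq = seq.upper()
    if not seq or any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError(f"not a valid amino-acid sequence: {seq!r}")
    counts = Counter(seq)
    n = len(seq)
    freqs = [counts[aa] / n for aa in AMINO_ACIDS]
    mean_hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / n
    return freqs + [mean_hydropathy]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy CDR3-like strings for illustration only; no binding specificity is implied.
    a = embed_cdr3("CASSLGQAYEQYF")
    b = embed_cdr3("CASSLGQGYEQYF")
    c = embed_cdr3("CAWSVPLGRREKLFF")
    print("similarity a-b:", cosine_similarity(a, b))
    print("similarity a-c:", cosine_similarity(a, c))
```

Vectors like these could then be passed to any downstream classifier or clustering method, which is the role the benchmarked embedding models play in the evaluation described above.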

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/266b/11781202/6a005cb70111/bbaf030f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/266b/11781202/dbc4a8458130/bbaf030f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/266b/11781202/a087d8bd3b64/bbaf030f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/266b/11781202/c5160045ff67/bbaf030f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/266b/11781202/2a6b676ae2be/bbaf030f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/266b/11781202/ed0d8f80fdaf/bbaf030f5.jpg

Similar articles

1. A comprehensive benchmarking for evaluating TCR embeddings in modeling TCR-epitope interactions.
   Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf030.
2. tcrBLOSUM: an amino acid substitution matrix for sensitive alignment of distant epitope-specific TCRs.
   Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae602.
3. Predicting TCR sequences for unseen antigen epitopes using structural and sequence features.
   Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae210.
4. EpicPred: predicting phenotypes driven by epitope-binding TCRs using attention-based multiple instance learning.
   Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf080.
5. Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification.
   Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa318.
6. TSpred: a robust prediction framework for TCR-epitope interactions using paired chain TCR sequence data.
   Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae472.
7. Comprehensive Analysis of CDR3 Sequences in Gluten-Specific T-Cell Receptors Reveals a Dominant R-Motif and Several New Minor Motifs.
   Front Immunol. 2021 Apr 13;12:639672. doi: 10.3389/fimmu.2021.639672. eCollection 2021.
8. Investigating TCR-pMHC interactions for TCRs without identified epitopes by constructing a computational pipeline.
   Int J Biol Macromol. 2024 Dec;282(Pt 1):136502. doi: 10.1016/j.ijbiomac.2024.136502. Epub 2024 Oct 18.
9. T-Cell Receptor Cognate Target Prediction Based on Paired α and β Chain Sequence and Structural CDR Loop Similarities.
   Front Immunol. 2019 Aug 28;10:2080. doi: 10.3389/fimmu.2019.02080. eCollection 2019.
10. TCR-H: explainable machine learning prediction of T-cell receptor epitope binding on unseen datasets.
    Front Immunol. 2024 Aug 16;15:1426173. doi: 10.3389/fimmu.2024.1426173. eCollection 2024.
