Determining epitope specificity of T-cell receptors with transformers.

Affiliations

Department of Intelligent Systems, Delft University of Technology, Delft 2600 GA, The Netherlands.

Leiden Computational Biology Center, Department of Molecular Epidemiology, Leiden University Medical Center, Leiden 2333 ZA, The Netherlands.

Publication information

Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad632.

DOI:10.1093/bioinformatics/btad632
PMID:37847663
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10636277/
Abstract

SUMMARY

T-cell receptors (TCRs) on T cells recognize and bind to epitopes presented by the major histocompatibility complex in case of an infection or cancer. However, the high diversity of TCRs, as well as their unique and complex binding mechanisms underlying epitope recognition, make it difficult to predict the binding between TCRs and epitopes. Here, we present the utility of transformers, a deep learning strategy that incorporates an attention mechanism that learns the informative features, and show that these models pre-trained on a large set of protein sequences outperform current strategies. We compared three pre-trained auto-encoder transformer models (ProtBERT, ProtAlbert, and ProtElectra) and one pre-trained auto-regressive transformer model (ProtXLNet) to predict the binding specificity of TCRs to 25 epitopes from the VDJdb database (human and murine). Two additional modifications were performed to incorporate gene usage of the TCRs in the four transformer models. Of all 12 transformer implementations (four models with three different modifications), a modified version of the ProtXLNet model could predict TCR-epitope pairs with the highest accuracy (weighted F1 score 0.55 simultaneously considering all 25 epitopes). The modification included additional features representing the gene names for the TCRs. We also showed that the basic implementation of transformers outperformed the previously available methods, i.e. TCRGP, TCRdist, and DeepTCR, developed for the same biological problem, especially for the hard-to-classify labels. We show that the proficiency of transformers in attention learning can be made operational in a complex biological setting like TCR binding prediction. Further ingenuity in utilizing the full potential of transformers, either through attention head visualization or introducing additional features, can extend T-cell research avenues.
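
The headline result above is a weighted F1 score computed over all 25 epitope classes at once. As a minimal sketch of how such a score is commonly defined (per-class F1 averaged with weights proportional to each class's number of true examples), the following may help. The function name `weighted_f1` and the toy labels are illustrative assumptions; this is not taken from the authors' code, and their evaluation pipeline may differ in detail.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores.

    Illustrative helper (hypothetical name): each class's F1 is weighted
    by the fraction of true examples that belong to that class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); denom > 0 here because n > 0
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Toy example with two hypothetical epitope labels (not real data).
y_true = ["epitope_A", "epitope_A", "epitope_A", "epitope_B"]
y_pred = ["epitope_A", "epitope_A", "epitope_B", "epitope_B"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because the weights follow class frequency, well-represented epitopes dominate the score, which is one reason the abstract discusses hard-to-classify labels separately.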

AVAILABILITY AND IMPLEMENTATION

Data and code are available on https://github.com/InduKhatri/tcrformer.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0674/10636277/29179db05e11/btad632f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0674/10636277/2f1b920eac75/btad632f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0674/10636277/9e44d8b393ee/btad632f3.jpg
