Language model-based B cell receptor sequence embeddings can effectively encode receptor specificity.

Affiliations

Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA.

Program in Applied Mathematics, Yale University, New Haven, CT, USA.

Publication information

Nucleic Acids Res. 2024 Jan 25;52(2):548-557. doi: 10.1093/nar/gkad1128.

DOI:10.1093/nar/gkad1128
PMID:38109302
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10810273/
Abstract

High throughput sequencing of B cell receptors (BCRs) is increasingly applied to study the immense diversity of antibodies. Learning biologically meaningful embeddings of BCR sequences is beneficial for predictive modeling. Several embedding methods have been developed for BCRs, but no direct performance benchmarking exists. Moreover, the impact of the input sequence length and paired-chain information on the prediction remains to be explored. We evaluated the performance of multiple embedding models to predict BCR sequence properties and receptor specificity. Despite the differences in model architectures, most embeddings effectively capture BCR sequence properties and specificity. BCR-specific embeddings slightly outperform general protein language models in predicting specificity. In addition, incorporating full-length heavy chains and paired light chain sequences improves the prediction performance of all embeddings. This study provides insights into the properties of BCR embeddings to improve downstream prediction applications for antibody analysis and discovery.
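The evaluation described in the abstract (embed each sequence as a fixed-length vector, then measure how well a downstream classifier predicts specificity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `composition_embedding` is a hypothetical stand-in for a language-model embedding, the nearest-centroid classifier and 5-fold split are arbitrary choices, and the "binder"/"non_binder" sequences are synthetic toy data. None of it reproduces the models, datasets, or metrics used in the study.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_embedding(seq):
    """Hypothetical stand-in embedding: 20-dim amino-acid frequency vector."""
    counts = Counter(seq)
    n = max(len(seq), 1)
    return [counts.get(a, 0) / n for a in AMINO_ACIDS]


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose training centroid lies closest to x."""
    by_label = {}
    for vec, label in zip(train_X, train_y):
        by_label.setdefault(label, []).append(vec)
    centroids = {label: centroid(vs) for label, vs in by_label.items()}
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


def cross_val_accuracy(seqs, labels, embed, k=5, seed=0):
    """k-fold cross-validated accuracy of the classifier on embed(seq)."""
    idx = list(range(len(seqs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    X = [embed(s) for s in seqs]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [labels[i] for i in train]
        for i in fold:
            correct += nearest_centroid_predict(train_X, train_y, X[i]) == labels[i]
    return correct / len(seqs)


def make_toy_data(n_per_class=50, seed=1):
    """Synthetic sequences; each class is enriched for two residues (toy signal)."""
    rng = random.Random(seed)
    seqs, labels = [], []
    for label, enriched in (("binder", "WY"), ("non_binder", "GS")):
        weights = [5 if a in enriched else 1 for a in AMINO_ACIDS]
        for _ in range(n_per_class):
            seqs.append("".join(rng.choices(AMINO_ACIDS, weights=weights, k=30)))
            labels.append(label)
    return seqs, labels


seqs, labels = make_toy_data()
accuracy = cross_val_accuracy(seqs, labels, composition_embedding)
print(f"toy cross-validated accuracy: {accuracy:.2f}")
```

To compare embeddings in this setup, one would pass a different `embed` function to `cross_val_accuracy` while keeping the folds and classifier fixed. The same holds for comparing inputs such as a sub-region versus the full-length chain, so any difference in accuracy is attributable to the representation alone.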

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d190/10810273/b875d9d22251/gkad1128fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d190/10810273/d30f3e0228e6/gkad1128figgra1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d190/10810273/127a2a7193dc/gkad1128fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d190/10810273/d09bad58d0c7/gkad1128fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d190/10810273/4a1473461141/gkad1128fig3.jpg
