Negative binomial mixture model for identification of noise in antibody-antigen specificity predictions from single-cell data.

Author information

Wasdin Perry T, Abu-Shmais Alexandra A, Irvin Michael W, Vukovich Matthew J, Georgiev Ivelin S

Affiliations

Program in Chemical and Physical Biology, Vanderbilt University Medical Center, Nashville, TN, 37232, United States.

Center for Computational Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN, 37232, United States.

Publication information

Bioinform Adv. 2024 Dec 4;4(1):vbae170. doi: 10.1093/bioadv/vbae170. eCollection 2024.

DOI:10.1093/bioadv/vbae170

PMID:39659592

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11631427/

Abstract

MOTIVATION

LIBRA-seq (linking B cell receptor to antigen specificity by sequencing) provides a powerful tool for interrogating the antigen-specific B cell compartment and identifying antibodies against antigen targets of interest. Identification of noise in single-cell B cell receptor sequencing data, such as LIBRA-seq, is critical for improving antigen binding predictions for downstream applications including antibody discovery and machine learning technologies.

RESULTS

In this study, we present a method for denoising LIBRA-seq data by clustering antigen counts into signal and noise components with a negative binomial mixture model. This approach leverages single-cell sequencing reads from a large, multi-donor dataset described in a recent LIBRA-seq study to develop a data-driven means for identification of technical noise. We apply this method to nine donors representing separate LIBRA-seq experiments and show that our approach provides improved predictions for antibody-antigen binding when compared to the standard scoring method, despite variance in data size and noise structure across samples. This development will improve the ability of LIBRA-seq to identify antigen-specific B cells and contribute to providing more reliable datasets for machine learning based approaches as the corpus of single-cell B cell sequencing data continues to grow.
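The idea of splitting antigen counts into a noise component and a signal component can be illustrated with a minimal sketch. The code below is not the authors' implementation (that is in the repository linked under Availability and Implementation); it assumes a two-component negative binomial mixture fitted by expectation-maximization to the counts of a single antigen, and it updates the dispersion with a simple moment estimate rather than a full maximum-likelihood step. The names `fit_nb_mixture`, `signal_posterior`, and `toy_counts` are hypothetical, and the counts are invented for illustration.

```python
import math

def nb_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu
    and dispersion r (variance = mu + mu**2 / r)."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))

def responsibilities(counts, w, mu, r):
    """E-step: posterior probability of each component for every count."""
    resp = []
    for k in counts:
        logp = [math.log(w[j]) + nb_logpmf(k, mu[j], r[j]) for j in range(2)]
        top = max(logp)  # subtract the max before exponentiating for stability
        p = [math.exp(lp - top) for lp in logp]
        total = sum(p)
        resp.append([pj / total for pj in p])
    return resp

def fit_nb_mixture(counts, n_iter=100):
    """Fit a two-component NB mixture by EM (assumed simplification:
    the dispersion update is a weighted method-of-moments estimate)."""
    srt = sorted(counts)
    half = len(srt) // 2
    # Initialise the means from the lower and upper halves of the sorted counts.
    mu = [max(sum(srt[:half]) / half, 1e-6),
          max(sum(srt[half:]) / (len(srt) - half), 1e-6)]
    r = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        resp = responsibilities(counts, w, mu, r)
        for j in range(2):
            nj = max(sum(row[j] for row in resp), 1e-12)
            w[j] = max(nj / len(counts), 1e-12)
            # For a fixed dispersion, the weighted mean is the ML estimate of mu.
            mu[j] = max(sum(row[j] * k for row, k in zip(resp, counts)) / nj, 1e-6)
            var = sum(row[j] * (k - mu[j]) ** 2 for row, k in zip(resp, counts)) / nj
            # Moment estimate of r from var = mu + mu^2 / r, clamped to [0.05, 1e3].
            r[j] = min(max(mu[j] ** 2 / (var - mu[j]), 0.05), 1e3) if var > mu[j] else 1e3
    return w, mu, r

def signal_posterior(counts, w, mu, r):
    """Posterior probability that each count comes from the higher-mean component."""
    sig = 0 if mu[0] > mu[1] else 1
    return [row[sig] for row in responsibilities(counts, w, mu, r)]

# Hypothetical per-cell counts for one antigen: mostly background, a few binders.
toy_counts = [0, 1, 0, 2, 1, 3, 0, 1, 2, 0, 0, 1, 4, 2, 1,
              150, 220, 180, 300, 95, 260]

if __name__ == "__main__":
    w, mu, r = fit_nb_mixture(toy_counts)
    for k, p in zip(toy_counts, signal_posterior(toy_counts, w, mu, r)):
        print(k, round(p, 3))
```

Each cell receives a posterior probability of belonging to the higher-mean component, which can then be thresholded to call it signal or noise. The moment-based dispersion update keeps the sketch dependency-free; an exact M-step for the dispersion would require numerical optimization.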

AVAILABILITY AND IMPLEMENTATION

All data and code are available at https://github.com/IGlab-VUMC/mixture_model_denoising.
