Consistency of VDJ Rearrangement and Substitution Parameters Enables Accurate B Cell Receptor Sequence Annotation.

Author information

Ralph Duncan K, Matsen Frederick A

Affiliation

Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America.

Publication information

PLoS Comput Biol. 2016 Jan 11;12(1):e1004409. doi: 10.1371/journal.pcbi.1004409. eCollection 2016 Jan.

DOI:10.1371/journal.pcbi.1004409

PMID:26751373

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4709141/

Abstract

VDJ rearrangement and somatic hypermutation work together to produce antibody-coding B cell receptor (BCR) sequences for a remarkable diversity of antigens. It is now possible to sequence these BCRs in high throughput; analysis of these sequences is bringing new insight into how antibodies develop, in particular for broadly-neutralizing antibodies against HIV and influenza. A fundamental step in such sequence analysis is to annotate each base as coming from a specific one of the V, D, or J genes, or from an N-addition (a.k.a. non-templated insertion). Previous work has used simple parametric distributions to model transitions from state to state in a hidden Markov model (HMM) of VDJ recombination, and assumed that mutations occur via the same process across sites. However, codon frame and other effects have been observed to violate these parametric assumptions for such coding sequences, suggesting that a non-parametric approach to modeling the recombination process could be useful. In our paper, we find that indeed large modern data sets suggest a model using parameter-rich per-allele categorical distributions for HMM transition probabilities and per-allele-per-position mutation probabilities, and that using such a model for inference leads to significantly improved results. We present an accurate and efficient BCR sequence annotation software package using a novel HMM "factorization" strategy. This package, called partis (https://github.com/psathyrella/partis/), is built on a new general-purpose HMM compiler that can perform efficient inference given a simple text description of an HMM.
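To make the annotation idea concrete, here is a minimal sketch of Viterbi decoding on a toy left-to-right HMM that labels each base as V, N (non-templated insertion), or J. It is not the partis implementation and omits the D gene entirely. The function name `annotate`, the parameters `v_exit`, `mut`, and `ins_stay`, and the germline strings are all hypothetical and chosen only for illustration. The list `v_exit` stands in for a per-allele categorical distribution: entry `i` is the probability of leaving the V gene after germline position `i`, with no parametric form imposed on it.

```python
import math

NEG_INF = float("-inf")


def log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def annotate(query, v_germ, j_germ, v_exit, mut=0.05, ins_stay=0.5):
    """Label each base of `query` as V, N, or J with Viterbi decoding.

    Toy assumptions: one V and one J allele, no D gene, no J 5' deletion,
    at least one inserted base, and a single mutation rate `mut`.
    """
    # One state per germline position, plus a single insertion state.
    states = ([("V", i) for i in range(len(v_germ))] + [("N", 0)]
              + [("J", i) for i in range(len(j_germ))])
    idx = {s: k for k, s in enumerate(states)}
    n_state = idx[("N", 0)]

    def emit(state, base):
        region, pos = state
        if region == "N":
            return 0.25  # insertions emit bases uniformly
        germ = v_germ[pos] if region == "V" else j_germ[pos]
        return 1.0 - mut if base == germ else mut / 3.0

    # trans[k] lists (next_state_index, probability) pairs.
    trans = {k: [] for k in range(len(states))}
    for i in range(len(v_germ)):
        k = idx[("V", i)]
        if i + 1 < len(v_germ):
            trans[k].append((idx[("V", i + 1)], 1.0 - v_exit[i]))
        trans[k].append((n_state, v_exit[i]))
    trans[n_state] = [(n_state, ins_stay), (idx[("J", 0)], 1.0 - ins_stay)]
    for i in range(len(j_germ) - 1):
        trans[idx[("J", i)]].append((idx[("J", i + 1)], 1.0))

    # Viterbi recursion in log space; the path must start at V position 0.
    score = [NEG_INF] * len(states)
    score[idx[("V", 0)]] = log(emit(("V", 0), query[0]))
    back = []
    for base in query[1:]:
        new = [NEG_INF] * len(states)
        ptr = [None] * len(states)
        for k, s in enumerate(score):
            if s == NEG_INF:
                continue
            for nxt, p in trans[k]:
                cand = s + log(p) + log(emit(states[nxt], base))
                if cand > new[nxt]:
                    new[nxt], ptr[nxt] = cand, k
        back.append(ptr)
        score = new

    # The path must end at the last J position; trace back from there.
    k = idx[("J", len(j_germ) - 1)]
    if score[k] == NEG_INF:
        raise ValueError("no valid V-N-J path for this query")
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    return "".join(states[k][0] for k in reversed(path))


# V germline "ACGTAC" trimmed by two bases, "CC" inserted, then J "TTGG".
print(annotate("ACGTCCTTGG", "ACGTAC", "TTGG",
               v_exit=[0.0, 0.0, 0.1, 0.3, 0.3, 1.0]))  # VVVVNNJJJJ
```

Because `v_exit` is an arbitrary table rather than a fitted curve, it can encode effects such as a deletion length that is never observed (probability 0), which is the kind of flexibility the abstract argues for. The per-allele-per-position mutation probabilities described above would correspond to replacing the single `mut` value with a table indexed by germline position.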
