454 antibody sequencing - error characterization and correction.

Author information

Prabakaran Ponraj, Streaker Emily, Chen Weizao, Dimitrov Dimiter S

Affiliation

Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), Frederick, MD 21702-1201, USA.

Publication information

BMC Res Notes. 2011 Oct 12;4:404. doi: 10.1186/1756-0500-4-404.

PMID:21992227

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3228814/

Abstract

BACKGROUND

454 sequencing is currently the method of choice for sequencing antibody repertoires and libraries, which contain large numbers (10^6 to 10^12) of different molecules with similar frameworks and variable regions; this similarity poses significant challenges for identifying sequencing errors. Identifying and correcting sequencing errors in such mixtures is especially important for exploring complex maturation pathways and for identifying putative germline predecessors of highly somatically mutated antibodies. To quantify and correct the errors introduced by 454 antibody sequencing, we sequenced six antibodies at different known concentrations twice and compared the reads with the corresponding known sequences determined by standard Sanger sequencing.
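
The comparison of a read against its known reference sequence can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it uses the standard-library `difflib.SequenceMatcher` as a stand-in for a proper pairwise aligner, and the function name `classify_differences` and the toy sequences are hypothetical.

```python
from difflib import SequenceMatcher

# Map difflib opcode tags to the error classes discussed in the abstract.
_KIND = {"insert": "insertion", "delete": "deletion", "replace": "substitution"}

def classify_differences(reference: str, read: str):
    """List the differences between a read and its known reference.

    Returns tuples of (kind, reference_position, reference_segment, read_segment).
    Assumption: difflib's matching heuristic is good enough for short toy
    sequences; real reads would need a true alignment algorithm.
    """
    matcher = SequenceMatcher(None, reference, read, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical stretch, nothing to report
        differences.append((_KIND[tag], i1, reference[i1:i2], read[j1:j2]))
    return differences

# Toy example: the read carries one extra T relative to the reference.
reference = "ACGTTGCA"
read = "ACGTTTGCA"
for kind, pos, ref_seg, read_seg in classify_differences(reference, read):
    print(kind, pos, repr(ref_seg), repr(read_seg))
```

A read with an empty difference list matches the reference exactly; any other read would count toward the fraction of incorrect reads.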

RESULTS

We found that 454 antibody sequencing could yield approximately 20% incorrect reads. Most errors were insertions at short homopolymer regions 2-3 nucleotides long; insertions, deletions and other variants at random sites were less frequent. Error correction might reduce the proportion of erroneous reads to 5-10%. However, errors accounting for 4-8% of the total reads could not be corrected unless sequencing is repeated several times, which may not be possible for large, diverse libraries and repertoires, including complete sets of antibodies (antibodyomes).
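
To make the homopolymer error class concrete, the following Python sketch locates homopolymer runs and tests whether a read differs from its reference only in run lengths. It is an illustration under stated assumptions, not the correction method used in the study; the function names (`homopolymer_runs`, `run_length_encode`, `homopolymer_length_errors`) and the example sequences are hypothetical.

```python
from itertools import groupby

def homopolymer_runs(seq: str, min_len: int = 2):
    """Return (start, base, length) for every run of identical bases >= min_len."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs

def run_length_encode(seq: str):
    """Collapse a sequence into (base, run_length) pairs."""
    return [(base, len(list(group))) for base, group in groupby(seq)]

def homopolymer_length_errors(reference: str, read: str):
    """Report runs whose length differs between reference and read.

    Returns a list of (base, reference_length, read_length) when the two
    sequences have the same order of bases after collapsing runs, and
    None when they differ in some other way (e.g. a substitution).
    """
    ref_rle = run_length_encode(reference)
    read_rle = run_length_encode(read)
    if [b for b, _ in ref_rle] != [b for b, _ in read_rle]:
        return None
    return [(b, rl, ql) for (b, rl), (_, ql) in zip(ref_rle, read_rle) if rl != ql]

# Toy example: a GG run in the reference is read as GGG.
print(homopolymer_runs("ACGGTTTA"))
print(homopolymer_length_errors("ACGGTA", "ACGGGTA"))
```

Reads flagged this way are candidates for correction by restoring the reference run length; reads returning `None` fall into the other error classes and need a full alignment to characterize.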

CONCLUSIONS

The experimental test procedure used to assess 454 antibody sequencing errors reveals a high proportion (up to 20%) of incorrect reads. The errors can be reduced to 5-10% but not lower, which calls for caution to avoid false discovery of antibody variants and diversity.
