454 antibody sequencing - error characterization and correction.

Author information

Prabakaran Ponraj, Streaker Emily, Chen Weizao, Dimitrov Dimiter S

Affiliation

Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), Frederick, MD 21702-1201, USA.

Publication information

BMC Res Notes. 2011 Oct 12;4:404. doi: 10.1186/1756-0500-4-404.

Abstract

BACKGROUND

454 sequencing is currently the method of choice for sequencing antibody repertoires and libraries, which contain large numbers (10^6 to 10^12) of different molecules with similar frameworks and variable regions; this similarity poses significant challenges for identifying sequencing errors. Identifying and correcting sequencing errors in such mixtures is especially important for exploring complex maturation pathways and for identifying putative germline predecessors of highly somatically mutated antibodies. To quantify and correct the errors introduced by 454 antibody sequencing, we sequenced six antibodies at different known concentrations twice and compared the reads with the corresponding known sequences determined by standard Sanger sequencing.

RESULTS

We found that 454 antibody sequencing could yield approximately 20% incorrect reads. Most of these errors were insertions at short homopolymer regions 2-3 nucleotides long; insertions, deletions and other variants at random sites were less common. Error correction might reduce the population of erroneous reads to 5-10%. However, errors accounting for 4-8% of the total reads could not be corrected unless sequencing is repeated several times, which may not be possible for large, diverse libraries and repertoires, including complete sets of antibodies (antibodyomes).
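
To illustrate the error classes described in these results, the following minimal Python sketch labels each read of a known antibody as correct, as a homopolymer-length error, or as another kind of error, by comparing it with its Sanger reference. It is not the authors' pipeline: the function names (`run_lengths`, `classify_read`, `error_fractions`) and the run-length comparison are assumptions made for illustration, and the sketch presumes that each read has already been assigned to its reference and trimmed to the same region.

```python
from itertools import groupby


def run_lengths(seq):
    """Encode a sequence as (base, run_length) pairs, e.g. 'AAT' -> [('A', 2), ('T', 1)]."""
    return [(base, len(list(group))) for base, group in groupby(seq.upper())]


def classify_read(read, reference):
    """Label a read as 'correct', 'homopolymer_indel', or 'other'.

    Hypothetical helper: a read counts as a homopolymer-length error when it
    has the same bases in the same order as the reference and differs only
    in how long one or more runs of identical bases are.
    """
    if read.upper() == reference.upper():
        return "correct"
    read_bases = [base for base, _ in run_lengths(read)]
    ref_bases = [base for base, _ in run_lengths(reference)]
    if read_bases == ref_bases:
        return "homopolymer_indel"
    return "other"


def error_fractions(reads, reference):
    """Return the fraction of reads that falls into each label."""
    counts = {"correct": 0, "homopolymer_indel": 0, "other": 0}
    for read in reads:
        counts[classify_read(read, reference)] += 1
    total = len(reads)
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Toy data, not taken from the study.
    reference = "ACGGTTA"
    reads = ["ACGGTTA", "ACGGGTTA", "ACGCTTA", "ACGGTTA"]
    print(error_fractions(reads, reference))
```

Note that this comparison also labels homopolymer deletions and insertions next to a single identical base as `homopolymer_indel`, so it is broader than the 2-3 nucleotide insertions highlighted in the results; a stricter check would compare the individual run lengths.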

CONCLUSIONS

The experimental test procedure used to assess 454 antibody sequencing errors reveals a high proportion (up to 20%) of incorrect reads. Correction can reduce the errors to 5-10% but no further, so caution is needed to avoid false discovery of antibody variants and diversity.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec59/3228814/b1b32fa50488/1756-0500-4-404-1.jpg
