Primer ID Validates Template Sampling Depth and Greatly Reduces the Error Rate of Next-Generation Sequencing of HIV-1 Genomic RNA Populations.

Authors

Zhou Shuntai, Jones Corbin, Mieczkowski Piotr, Swanstrom Ronald

Affiliations

UNC Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.

Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.

Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.

Publication information

J Virol. 2015 Aug;89(16):8540-55. doi: 10.1128/JVI.00522-15. Epub 2015 Jun 3.

Abstract

Validating the sampling depth and reducing sequencing errors are critical for studies of viral populations using next-generation sequencing (NGS). We previously described the use of Primer ID to tag each viral RNA template with a block of degenerate nucleotides in the cDNA primer. We now show that low-abundance Primer IDs (offspring Primer IDs) are generated due to PCR/sequencing errors. These artifactual Primer IDs can be removed using a cutoff model for the number of reads required to make a template consensus sequence. We have modeled the fraction of sequences lost due to Primer ID resampling. For a typical sequencing run, less than 10% of the raw reads are lost to offspring Primer ID filtering and resampling. The remaining raw reads are used to correct for PCR resampling and sequencing errors. We also demonstrate that Primer ID reveals bias intrinsic to PCR, especially at low template input or utilization. cDNA synthesis and PCR convert ca. 20% of RNA templates into recoverable sequences, and 30-fold sequence coverage recovers most of these template sequences. We have directly measured the residual error rate to be around 1 in 10,000 nucleotides. We use this error rate and the Poisson distribution to define the cutoff to identify preexisting drug resistance mutations at low abundance in an HIV-infected subject. Collectively, these studies show that >90% of the raw sequence reads can be used to validate template sampling depth and to dramatically reduce the error rate in assessing a genetically diverse viral population using NGS.
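The consensus step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `template_consensus`, the fixed `min_reads` cutoff (a stand-in for the paper's cutoff model, whose details the abstract does not give), and the assumption that reads are already aligned and of equal length are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def template_consensus(reads, min_reads=3):
    """Build one consensus sequence per Primer ID.

    reads: iterable of (primer_id, sequence) pairs. Sequences are
    assumed (for this sketch) to be aligned and of equal length.
    min_reads: hypothetical fixed cutoff; Primer IDs with fewer reads
    are treated as likely artifacts (offspring Primer IDs) and dropped.
    """
    groups = defaultdict(list)
    for primer_id, seq in reads:
        groups[primer_id].append(seq)

    consensus = {}
    for primer_id, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to call a template consensus
        # Majority vote at each position; ties go to the base seen first.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


reads = [
    ("AAAC", "ACGT"),
    ("AAAC", "ACGT"),
    ("AAAC", "ACTT"),  # one read carries a sequencing error at position 3
    ("AAAG", "ACGA"),  # single-read Primer ID, filtered by the cutoff
]
print(template_consensus(reads, min_reads=3))
```

In this toy input the error in the third read is outvoted by the other two reads sharing the same Primer ID, and the single-read Primer ID is discarded, which mirrors in miniature how resampled reads are used to correct errors rather than being counted as independent templates.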

IMPORTANCE

Although next-generation sequencing (NGS) has revolutionized sequencing strategies, it suffers from serious limitations in defining sequence heterogeneity in a genetically diverse population, such as HIV-1, due to PCR resampling and PCR/sequencing errors. The Primer ID approach reveals the true sampling depth and greatly reduces errors. Knowing the sampling depth allows the construction of a model of how to maximize the recovery of sequences from input templates and to reduce resampling of the Primer ID so that appropriate multiplexing can be included in the experimental design. With the defined sampling depth and measured error rate, we are able to assign cutoffs for the accurate detection of minority variants in viral populations. This approach allows the power of NGS to be realized without having to guess about sampling depth or to ignore the problem of PCR resampling, while also being able to correct most of the errors in the data set.
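One way to picture how a sampling depth and an error rate translate into a minority-variant cutoff is the sketch below. It assumes the residual error rate of about 1 in 10,000 nucleotides reported in the abstract, treats the number of erroneous template consensus sequences at a single position as Poisson distributed, and uses a significance level `alpha` that is purely a hypothetical choice. The function names are illustrative, and the paper's exact procedure may differ (for example, in how it handles testing many positions at once).

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term            # add P(X = i)
        term *= lam / (i + 1)  # advance to P(X = i + 1)
    return max(0.0, 1.0 - cdf)


def min_variant_count(n_templates, error_rate=1e-4, alpha=0.05):
    """Smallest count k at one position such that seeing k or more
    variant sequences by residual error alone has probability < alpha.

    n_templates: number of template consensus sequences (sampling depth).
    error_rate: assumed residual error rate per nucleotide.
    alpha: hypothetical significance level, not taken from the paper.
    """
    lam = n_templates * error_rate  # expected erroneous sequences
    k = 1
    while poisson_tail(k, lam) >= alpha:
        k += 1
    return k


print(min_variant_count(5000))              # alpha = 0.05
print(min_variant_count(5000, alpha=0.01))  # stricter threshold
```

With 5,000 template consensus sequences the expected number of erroneous sequences at a position is 0.5, so under these assumptions a variant would need to appear in at least 3 sequences at alpha = 0.05, or 4 at alpha = 0.01. Applying the full per-nucleotide error rate to one specific substitution is conservative, since only some errors produce that particular base.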
