Error correction of next-generation sequencing data and reliable estimation of HIV quasispecies.

Affiliation

Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.

Publication information

Nucleic Acids Res. 2010 Nov;38(21):7400-9. doi: 10.1093/nar/gkq655. Epub 2010 Jul 29.

Abstract

Next-generation sequencing technologies can be used to analyse genetically heterogeneous samples at unprecedented detail. The high coverage achievable with these methods enables the detection of many low-frequency variants. However, sequencing errors complicate the analysis of mixed populations and result in inflated estimates of genetic diversity. We developed a probabilistic Bayesian approach to minimize the effect of errors on the detection of minority variants. We applied it to pyrosequencing data obtained from a 1.5-kb-fragment of the HIV-1 gag/pol gene in two control and two clinical samples. The effect of PCR amplification was analysed. Error correction resulted in a two- and five-fold decrease of the pyrosequencing base substitution rate, from 0.05% to 0.03% and from 0.25% to 0.05% in the non-PCR and PCR-amplified samples, respectively. We were able to detect viral clones as rare as 0.1% with perfect sequence reconstruction. Probabilistic haplotype inference outperforms the counting-based calling method in both precision and recall. Genetic diversity observed within and between two clinical samples resulted in various patterns of phenotypic drug resistance and suggests a close epidemiological link. We conclude that pyrosequencing can be used to investigate genetically diverse samples with high accuracy if technical errors are properly treated.
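
The fold decreases quoted in the abstract can be checked directly from the reported substitution rates. The short Python sketch below is only an illustration: the function name `fold_decrease` and the dictionary layout are assumptions made here, not part of the authors' software, and the inputs are the rounded percentages from the abstract, so the ratios are approximate.

```python
def fold_decrease(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


# Base substitution rates in percent (before, after error correction),
# copied from the abstract; these are rounded published values.
rates = {
    "non-PCR": (0.05, 0.03),
    "PCR-amplified": (0.25, 0.05),
}

# Hypothetical helper structure: fold decrease per sample type.
folds = {name: fold_decrease(before, after) for name, (before, after) in rates.items()}

for name, fold in folds.items():
    print(f"{name}: {fold:.2f}-fold decrease")
```

With these rounded inputs the non-PCR ratio comes out at about 1.67, which the abstract reports as two-fold, and the PCR-amplified ratio is 5.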

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a4c/2995073/8adb33e013ae/gkq655f1.jpg
