Xu Chang, Nezami Ranjbar Mohammad R, Wu Zhong, DiCarlo John, Wang Yexun
Life Science Research and Foundation, Qiagen Sciences, Inc., 6951 Executive Way, Frederick, Maryland, 21703, USA.
BMC Genomics. 2017 Jan 3;18(1):5. doi: 10.1186/s12864-016-3425-4.
Detecting DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next-generation sequencing, researchers need a detection method that 1) efficiently captures rare mutation-containing DNA fragments from a background of abundant wild-type DNA; 2) sequences the DNA library to deep coverage; and 3) distinguishes true low-level variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers gives researchers a convenient way to sequence a small but highly relevant region deeply on benchtop sequencers. Molecular barcoding (or indexing) offers a unique way to reduce sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in the recent literature, most variant calling has been done on limited targets using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models.
We present a highly efficient, simple, and scalable enrichment protocol that integrates molecular barcodes into multiplex PCR amplification. In addition, we developed smCounter, an open-source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets containing SNVs and indels at 5% and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions.
We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
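The barcode-aware idea described above can be illustrated with a minimal sketch. It is not smCounter's Bayesian model. It is a plain majority-vote consensus, shown only to make clear why grouping reads by molecular barcode suppresses amplification and sequencing errors. The function names and the thresholds `min_reads` and `min_agreement` are hypothetical choices for illustration, not values from the paper.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=2, min_agreement=0.8):
    """Collapse reads at one genomic position into one base per barcode.

    `reads` is an iterable of (barcode, base) pairs. Both thresholds are
    illustrative assumptions, not parameters taken from smCounter.
    """
    groups = defaultdict(list)
    for barcode, base in reads:
        groups[barcode].append(base)

    consensus = {}
    for barcode, bases in groups.items():
        # Skip barcodes with too few reads to form a consensus.
        if len(bases) < min_reads:
            continue
        base, count = Counter(bases).most_common(1)[0]
        # Keep the barcode only if its reads largely agree; isolated
        # disagreements are treated as PCR or sequencing errors.
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus


def allele_fraction(consensus, alt):
    """Fraction of consensus molecules that carry the `alt` base."""
    if not consensus:
        return 0.0
    return sum(1 for b in consensus.values() if b == alt) / len(consensus)


if __name__ == "__main__":
    reads = (
        [("AAA", "A")] * 5 + [("AAA", "T")]   # one erroneous read, outvoted
        + [("CCC", "T")] * 3                  # consistent variant molecule
        + [("GGG", "A"), ("GGG", "T")]        # ambiguous, dropped
        + [("TTT", "A")]                      # single read, dropped
    )
    calls = consensus_by_barcode(reads)
    print(calls, allele_fraction(calls, "T"))
```

Counting molecules rather than raw reads means a single erroneous read within a barcode family does not contribute to the variant allele fraction. A probabilistic caller such as the one described in the abstract would replace the hard thresholds with a statistical model.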