Summerer Daniel, Wu Haiguo, Haase Bettina, Cheng Yang, Schracke Nadine, Stähler Cord F, Chee Mark S, Stähler Peer F, Beier Markus
febit biomed gmbh, 69120 Heidelberg, Germany.
Genome Res. 2009 Sep;19(9):1616-21. doi: 10.1101/gr.091942.109. Epub 2009 Jul 28.
The lack of efficient high-throughput methods for enrichment of specific sequences from genomic DNA represents a key bottleneck in exploiting the enormous potential of next-generation sequencers. Such methods would allow for a systematic and targeted analysis of relevant genomic regions. Recent studies reported sequence enrichment using a hybridization step to specific DNA capture probes as a possible solution to the problem. However, so far no method has provided sufficient depth of coverage for reliable base calling over the entire target regions. We report a strategy to multiply the enrichment performance and consequently improve depth and breadth of coverage for desired target sequences by applying two iterative cycles of hybridization with microfluidic Geniom biochips. Using this strategy, we enriched and then sequenced the cancer-related genes BRCA1 and TP53 and a set of 1000 individual dbSNP regions of 500 bp using Illumina technology. We achieved overall enrichment factors of up to 1062-fold and average coverage depths of 470-fold. Combined with high coverage uniformity, this resulted in nearly complete consensus coverages with >86% of the target regions covered at 20-fold or higher. Analysis of SNP calling accuracies after enrichment revealed excellent concordance with the reference sequence, closely mirroring the previously reported performance of Illumina sequencing conducted without sequence enrichment.
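The abstract reports three kinds of figures: an enrichment factor, an average coverage depth, and the fraction of the target covered at 20-fold or higher. It does not state how these were computed, so the Python sketch below is only one common way to define them, not the authors' method. The function names, the definition of enrichment as on-target read fraction divided by the target's share of the genome, the genome size of roughly 3 Gb, and all input numbers are assumptions made for illustration. Only the 500 kb target size (1000 regions of 500 bp) comes from the abstract.

```python
def fold_enrichment(on_target_reads, total_mapped_reads, target_bp, genome_bp):
    """Assumed definition: on-target read fraction / target share of the genome."""
    on_target_fraction = on_target_reads / total_mapped_reads
    target_fraction = target_bp / genome_bp
    return on_target_fraction / target_fraction


def coverage_summary(depths, min_depth=20):
    """Return (mean depth, fraction of positions with depth >= min_depth)."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    mean_depth = sum(depths) / len(depths)
    breadth = sum(1 for d in depths if d >= min_depth) / len(depths)
    return mean_depth, breadth


if __name__ == "__main__":
    # Hypothetical read counts; target size is 1000 regions x 500 bp,
    # genome size is an approximate, assumed value for the human genome.
    factor = fold_enrichment(
        on_target_reads=150_000,
        total_mapped_reads=1_000_000,
        target_bp=1000 * 500,
        genome_bp=3_000_000_000,
    )
    print(f"fold enrichment: {factor:.0f}")

    # Toy per-base depths, not data from the study.
    mean_depth, breadth = coverage_summary([0, 10, 20, 30, 40], min_depth=20)
    print(f"mean depth: {mean_depth:.1f}, fraction at >=20x: {breadth:.2f}")
```

With these toy inputs, 15% of reads landing on a target that makes up about 0.017% of the genome gives an enrichment of roughly 900-fold. This shows how a small on-target percentage can still correspond to an enrichment factor in the range the abstract reports.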