Pacific Biosciences, Menlo Park, California, USA.
Nat Methods. 2013 Jun;10(6):563-9. doi: 10.1038/nmeth.2474. Epub 2013 May 5.
We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.
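The seed-selection step described in the abstract ("uses the longest reads as seeds") can be illustrated with a minimal sketch. It is not the HGAP implementation: the function names, the `seed_coverage` parameter, and the rule of accumulating the longest reads until they reach a target multiple of the genome size are all assumptions made for illustration, since the abstract does not say how the length cutoff is chosen. The graph-based consensus and the downstream assembly are not modeled here. The second helper only converts an accuracy figure to the standard Phred scale, under which the quoted 99.999% corresponds to roughly Q50.

```python
import math


def select_seed_reads(read_lengths, genome_size, seed_coverage=30):
    """Pick the longest reads as seeds (illustrative assumption, not HGAP's code).

    Reads are taken in order of decreasing length until their summed length
    reaches seed_coverage * genome_size. Returns (length_cutoff, seed_indices),
    where length_cutoff is the length of the shortest read chosen as a seed.
    """
    target = seed_coverage * genome_size
    # Indices of reads, longest first.
    order = sorted(range(len(read_lengths)),
                   key=lambda i: read_lengths[i], reverse=True)
    seeds, total = [], 0
    for i in order:
        if total >= target:
            break
        seeds.append(i)
        total += read_lengths[i]
    cutoff = read_lengths[seeds[-1]] if seeds else 0
    return cutoff, seeds


def accuracy_to_phred(accuracy):
    """Convert a per-base accuracy (0 <= accuracy < 1) to a Phred-scaled QV."""
    return -10 * math.log10(1 - accuracy)


if __name__ == "__main__":
    # Toy data: six read lengths and a hypothetical 1 kb "genome".
    lengths = [1200, 8000, 3000, 9500, 500, 7000]
    cutoff, seeds = select_seed_reads(lengths, genome_size=1000, seed_coverage=20)
    print("seed length cutoff:", cutoff)
    print("seed read indices:", seeds)
    print("QV for 99.999% accuracy:", round(accuracy_to_phred(0.99999), 2))
```

In this sketch, every read that is not selected as a seed would be among the reads mapped to the seeds for the consensus step. A higher `seed_coverage` lowers the length cutoff and yields more, shorter seeds.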