Nuffield Department of Medicine, University of Oxford, Oxford, UK.
Department of Tropical Disease Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK.
Microb Genom. 2019 Sep;5(9). doi: 10.1099/mgen.0.000294. Epub 2019 Aug 30.
Illumina sequencing allows rapid, cheap and accurate whole genome bacterial analyses, but short reads (<300 bp) do not usually enable complete genome assembly. Long-read sequencing greatly assists with resolving complex bacterial genomes, particularly when combined with short-read Illumina data (hybrid assembly). However, it is not clear how different long-read sequencing methods affect hybrid assembly accuracy. Relative automation of the assembly process is also crucial to facilitating high-throughput complete bacterial genome reconstruction, avoiding multiple bespoke filtering and data manipulation steps. In this study, we compared hybrid assemblies for 20 bacterial isolates, including two reference strains, using Illumina sequencing and long reads from either Oxford Nanopore Technologies (ONT) or SMRT Pacific Biosciences (PacBio) sequencing platforms. We chose isolates from the family Enterobacteriaceae, as these frequently have highly plastic, repetitive genetic structures, and complete genome reconstruction for these species is relevant for a precise understanding of the epidemiology of antimicrobial resistance. We assembled genomes using the hybrid assembler Unicycler and compared different read processing strategies, as well as comparing to long-read-only assembly with Flye followed by short-read polishing with Pilon. Hybrid assembly with either PacBio or ONT reads facilitated high-quality genome reconstruction, and was superior to the long-read assembly and polishing approach evaluated with respect to accuracy and completeness. Combining ONT and Illumina reads fully resolved most genomes without additional manual steps, and at a lower consumables cost per isolate in our setting. Automated hybrid assembly is a powerful tool for complete and accurate bacterial genome assembly.
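The two strategies compared in the abstract (hybrid assembly with Unicycler versus long-read-only assembly with Flye followed by Pilon polishing) can be sketched as command lines. The Python below is a minimal illustration, not the authors' actual pipeline: the helper names, file names, thread counts and the presence of a `pilon` wrapper on the PATH are all assumptions, and the read-mapping step that produces the BAM file for Pilon is omitted. The functions only build argument lists; they do not run the tools.

```python
from typing import List


def unicycler_hybrid_cmd(short_r1: str, short_r2: str, long_reads: str,
                         out_dir: str, threads: int = 8) -> List[str]:
    """Build a Unicycler hybrid-assembly command (Illumina pairs plus long reads)."""
    return ["unicycler",
            "-1", short_r1,      # Illumina forward reads
            "-2", short_r2,      # Illumina reverse reads
            "-l", long_reads,    # ONT or PacBio long reads
            "-o", out_dir,
            "-t", str(threads)]


def flye_cmd(long_reads: str, out_dir: str, platform: str,
             threads: int = 8) -> List[str]:
    """Build a Flye long-read-only assembly command for 'ont' or 'pacbio' reads."""
    read_flag = {"ont": "--nano-raw", "pacbio": "--pacbio-raw"}
    if platform not in read_flag:
        raise ValueError(f"unknown platform: {platform!r}")
    return ["flye", read_flag[platform], long_reads,
            "--out-dir", out_dir,
            "--threads", str(threads)]


def pilon_cmd(assembly_fasta: str, short_read_bam: str,
              out_prefix: str) -> List[str]:
    """Build a Pilon polishing command.

    Assumes short reads were already aligned to the draft assembly and the
    alignment was sorted and indexed (that mapping step is not shown here).
    """
    return ["pilon",
            "--genome", assembly_fasta,
            "--frags", short_read_bam,   # paired-end Illumina alignments
            "--output", out_prefix]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(unicycler_hybrid_cmd("iso_R1.fastq.gz", "iso_R2.fastq.gz",
                                        "iso_ont.fastq.gz", "hybrid_out")))
    print(" ".join(flye_cmd("iso_ont.fastq.gz", "flye_out", "ont")))
    print(" ".join(pilon_cmd("flye_out/assembly.fasta", "iso_short.bam",
                             "iso_polished")))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed. Keeping command construction separate from execution makes it easier to log or review the exact invocations per isolate, which suits the automated, high-throughput setting the abstract describes.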