Michigan Department of Health and Human Services, Bureau of Laboratories, Lansing, MI, 48906, USA.
Microb Genom. 2023 May;9(5). doi: 10.1099/mgen.0.001024.
Whole-genome sequencing has become a preferred method for studying bacterial plasmids, as it is generally assumed to capture the entire genome. However, long-read genome assemblers have been shown to sometimes miss plasmid sequences, an issue that has been associated with plasmid size. The purpose of this study was to investigate the relationship between plasmid size and plasmid recovery by the long-read-only assemblers Flye, Raven, Miniasm, and Canu. This was accomplished by determining the number of times each assembler successfully recovered 33 plasmids, ranging from 1919 to 194,062 bp in size and belonging to 14 bacterial isolates from six bacterial genera, using Oxford Nanopore long reads. These results were additionally compared to plasmid recovery rates by the short-read-first assembler Unicycler, using both Oxford Nanopore long reads and Illumina short reads. Results from this study indicate that Canu, Flye, Miniasm, and Raven are prone to missing plasmid sequences, whereas Unicycler recovered 100% of plasmid sequences. Excluding Canu, most plasmid loss by long-read-only assemblers was due to failure to recover plasmids smaller than 10 kb. As such, it is recommended that Unicycler be used to increase the likelihood of plasmid recovery during bacterial genome assembly.
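The comparison described above (recovery counted per assembler and related to plasmid size) could be tallied along the following lines. This is only a minimal sketch, not the authors' analysis: the function name `recovery_by_size_class`, the record layout, the assembler labels, and the example records are all hypothetical and chosen for illustration. Only the 10 kb threshold and the two endpoint sizes (1919 and 194,062 bp) come from the abstract.

```python
from collections import defaultdict

# 10 kb cut-off mentioned in the abstract; everything else below is illustrative.
SMALL_PLASMID_MAX_BP = 10_000


def recovery_by_size_class(records, threshold_bp=SMALL_PLASMID_MAX_BP):
    """Return {(assembler, size_class): fraction of plasmids recovered}.

    records: iterable of (assembler, plasmid_size_bp, recovered) tuples,
    a hypothetical layout assumed for this sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # [recovered, total] per group
    for assembler, size_bp, recovered in records:
        size_class = "small" if size_bp < threshold_bp else "large"
        tally = counts[(assembler, size_class)]
        tally[0] += int(bool(recovered))
        tally[1] += 1
    return {key: rec / total for key, (rec, total) in counts.items()}


# Made-up records for illustration only; NOT results from the study.
example = [
    ("assembler_A", 1919, False),
    ("assembler_A", 4500, True),
    ("assembler_A", 194062, True),
    ("assembler_B", 1919, True),
]
rates = recovery_by_size_class(example)
```

Grouping by both assembler and size class keeps the small-plasmid recovery rate separate from the overall rate, which is the distinction the abstract draws for plasmids under 10 kb.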