Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia.
Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia.
GigaScience. 2020 Dec 21;9(12). doi: 10.1093/gigascience/giaa146.
Sequencing technologies have advanced to the point where it is possible to generate high-accuracy, haplotype-resolved, chromosome-scale assemblies. Several long-read sequencing technologies are available, and a growing number of algorithms have been developed to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology, as well as the most appropriate software for assembly and polishing. It is thus important to benchmark different approaches applied to the same sample.
Here, we report a comparison of 3 long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of Pacific Biosciences and Nanopore reads. Results obtained from combining long-read technologies or short-read and long-read technologies are also presented. The assemblies were compared for contiguity, base accuracy, and completeness, as well as sequencing costs and DNA material requirements.
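Contiguity of an assembly is commonly summarized with the N50 statistic: the contig length at which contigs of that length or longer contain at least half of the assembled bases. The abstract does not state which contiguity metric was used, so the following Python sketch is only an illustration under that assumption; the function name `n50` and the example lengths are hypothetical and are not data from this study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")

    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only (not from the paper).
example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example))  # total is 54; 10 + 9 + 8 = 27 reaches half, so N50 is 8
```

A higher N50 for the same total assembly size indicates that the genome is represented by fewer, longer contigs. N50 says nothing about base accuracy or completeness, which is why those are assessed separately.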
The 3 long-read technologies produced highly contiguous and complete genome assemblies of M. jansenii. At the time of sequencing, the costs of the methods differed significantly, but continuous improvements in the technologies have since resulted in greater accuracy, increased throughput, and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.