Department of Computer Architecture, University of Malaga, Louis Pasteur, 35, Campus de Teatinos, Malaga 29071, Spain.
Supercomputing and Bioinnovation Center, University of Malaga, C. Severo Ochoa, 34, Malaga 29590, Spain.
Genomics. 2023 Sep;115(5):110700. doi: 10.1016/j.ygeno.2023.110700. Epub 2023 Aug 18.
The recent advent of long-read sequencing technologies, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), has led to substantial improvements in accuracy and computational cost. However, de novo whole-genome assembly still presents significant challenges related to computational cost and the quality of the results. Meanwhile, sequencing accuracy and throughput continue to improve, and new tools are constantly emerging. Therefore, selecting the right sequencing platform, sequencing depth, and assembly tools is necessary to perform high-quality assembly. This paper evaluates primary assembly reconstruction by recent hybrid and non-hybrid pipelines on different genomes. We find that PacBio high-fidelity (HiFi) long reads play an essential role in haplotype construction compared with ONT reads. However, we observe a substantial improvement in assembly correctness when using high-fidelity ONT datasets and when combining them with HiFi or short reads.