School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.
GrandOmics Biosciences, Beijing 102206, China.
Genomics Proteomics Bioinformatics. 2024 May 9;22(1). doi: 10.1093/gpbjnl/qzad009.
The high-fidelity (HiFi) long-read sequencing technology developed by PacBio has greatly improved the base-level accuracy of genome assemblies. However, these assemblies still contain base-level errors, particularly within the error-prone regions of HiFi long reads. Existing genome polishing tools usually introduce overcorrections and haplotype switch errors when correcting errors in genomes assembled from HiFi long reads. Here, we describe NextPolish2, an upgraded genome polishing tool that can fix the base errors remaining in those "highly accurate" genomes assembled from HiFi long reads without introducing excessive overcorrections and haplotype switch errors. We believe that NextPolish2 will be of great value in further improving the accuracy of telomere-to-telomere (T2T) genomes. NextPolish2 is freely available at https://github.com/Nextomics/NextPolish2.