NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads.

Hu Jiang, Wang Zhuo, Sun Zongyi, Hu Benxia, Ayoola Adeola Oluwakemi, Liang Fan, Li Jingjing, Sandoval José R, Cooper David N, Ye Kai, Ruan Jue, Xiao Chuan-Le, Wang Depeng, Wu Dong-Dong, Wang Sheng

GrandOmics Biosciences, Beijing, 102206, China.

School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China.

Genome Biol. 2024 Apr 26;25(1):107. doi: 10.1186/s13059-024-03252-4.

Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.
