Department of Genetics, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL, 35294, USA.
Informatics Institute, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL, 35294, USA.
Genome Biol. 2021 Nov 14;22(1):312. doi: 10.1186/s13059-021-02527-4.
Long-read de novo genome assembly continues to advance rapidly. However, effective tools for accurately evaluating assembly results are lacking, especially for structural errors. We present Inspector, a reference-free long-read de novo assembly evaluator that faithfully reports error types and their precise locations. Notably, Inspector can also correct assembly errors using consensus sequences derived from the raw reads covering the erroneous regions. Using in silico results and long-read assemblies generated from multiple long-read datasets and assemblers, we demonstrate that, in addition to providing generic metrics, Inspector can accurately identify both large-scale and small-scale assembly errors.