Wang Peng, Wang Fei
Key Laboratory of Crop Gene Resources and Germplasm Enhancement in Southern China, Ministry of Agriculture and Rural Affairs, Institute of Tropical Crop Genetic Resources, Chinese Academy of Tropical Agricultural Sciences, No. 4 Xueyuan Rd, Haikou City, Hainan 571101, China.
School of Electrical and Electronic Engineering, Shanghai Institute of Technology, No. 100 Haiquan Rd, Shanghai 201416, China.
Trends Genet. 2023 Mar;39(3):175-186. doi: 10.1016/j.tig.2022.10.005. Epub 2022 Nov 17.
Quality control is essential for genome assemblies; however, no consensus has been reached on which metrics should be used to evaluate assembly quality. N50 is widely used to measure contiguity, but its effectiveness is repeatedly questioned. Prevailing completeness metrics focus on gene space, while challenging regions such as tandem repeats are commonly overlooked. Correctness has become an indispensable dimension of quality control, yet prevailing assembly releases lack scores that reflect it. We propose a metric set, together with a set of statistical indexes, for effective and comprehensive evaluation of assemblies, and we provide the score of a finished assembly for each metric, which can serve as a benchmark for achieving high-quality genome assemblies.
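For readers unfamiliar with N50, the contiguity measure named in the abstract, the following is a minimal sketch of its standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function name `n50` and the contig lengths are illustrative assumptions, not taken from the article. The second call illustrates one commonly cited reason the metric is questioned: discarding short contigs can raise N50 without improving the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Hypothetical helper for illustration; it follows the standard
    definition of N50 and is not code from the article.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up contig lengths (arbitrary units) for demonstration only.
contigs = [80, 70, 50, 40, 30, 20, 10, 5]
print(n50(contigs))

# Dropping contigs shorter than 25 changes the total, and with it the N50.
filtered = [length for length in contigs if length >= 25]
print(n50(filtered))
```

In this toy data set the filtered list yields a larger N50 than the full list even though no contig became longer, which is why N50 is usually reported alongside other statistics rather than on its own.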