Department of Computational Biology, University of Lausanne, 1015, Lausanne, Switzerland.
SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland.
Genome Biol. 2023 Oct 5;24(1):221. doi: 10.1186/s13059-023-03061-1.
Genomic benchmark datasets are essential to driving progress in genomics and bioinformatics. They provide a snapshot of the performance of sequencing technologies and analytical methods and highlight future challenges. However, they depend on the sequencing technology, the reference genome, and the available benchmarking methods. Thus, creating a genomic benchmark dataset is highly challenging, often involving multiple sequencing technologies, different variant calling tools, and laborious manual curation. In this review, we discuss the available benchmark datasets and their utility. Additionally, we focus on the most recent benchmark of medically relevant genes with challenging genomic complexity.