Computer Science and Engineering, University of Michigan, 2260 Hayward Street, Ann Arbor, MI, 48109, USA.
Nat Commun. 2023 Dec 9;14(1):8149. doi: 10.1038/s41467-023-43876-x.
Accurate benchmarking of small variant calling is critical for the continued improvement of human whole genome sequencing. In this work, we show that current variant calling evaluations are biased towards certain variant representations and may misrepresent the relative performance of different variant calling pipelines. We propose solutions, first exploring the affine gap parameter design space for complex variant representation and suggesting a standard. Next, we present our tool vcfdist and demonstrate the importance of enforcing local phasing for evaluation accuracy. We then introduce the notion of partial credit for mostly correct calls and present an algorithm for clustering dependent variants. Lastly, we motivate using alignment distance metrics to supplement precision-recall curves for understanding variant calling performance. We evaluate the performance of 64 phased Truth Challenge V2 submissions and show that vcfdist improves the consistency of measured insertion and deletion performance across variant representations from R = 0.97243 for baseline vcfeval to 0.99996 for vcfdist.