Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA 90095, USA.
Indian Institute of Technology Delhi, Hauz Khas, New Delhi, Delhi 110016, India.
Brief Bioinform. 2022 Jul 18;23(4). doi: 10.1093/bib/bbac221.
Advances in whole-genome sequencing (WGS) promise to enable accurate and comprehensive structural variant (SV) discovery. Dissecting SVs from WGS data presents substantial challenges, and a plethora of SV detection methods have been developed. However, investigators currently lack evidence they can use to select appropriate SV detection tools. In this article, we evaluate the performance of SV detection tools on mouse and human WGS data using a comprehensive polymerase chain reaction (PCR)-confirmed gold standard set of SVs and the Genome in a Bottle variant set, respectively. In contrast to previous benchmarking studies, our gold standard dataset included a complete set of SVs, allowing us to report both the precision and the sensitivity of the SV detection methods. Our study investigates the ability of the methods to detect deletions and therefore provides an optimistic estimate of SV detection performance, because methods that fail to detect deletions are likely to miss more complex SVs. We found that SV detection tools varied widely in their performance, with several methods providing a good balance between sensitivity and precision. Additionally, we determined which SV callers are best suited for low- and ultralow-pass sequencing data, as well as for different deletion length categories.
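As an illustration of why a complete gold standard matters, the following minimal sketch shows how precision and sensitivity could be computed for a set of deletion calls. The matching rule used here (same chromosome, both breakpoints within a fixed `tolerance` in base pairs) and all names in the code are assumptions made for illustration; the abstract does not state the matching criterion the study actually applied.

```python
from typing import List, Tuple

# A deletion is represented as (chromosome, start, end); this layout is an
# illustrative assumption, not a format taken from the study.
Deletion = Tuple[str, int, int]


def matches(call: Deletion, truth: Deletion, tolerance: int) -> bool:
    """Return True if both breakpoints agree within `tolerance` base pairs."""
    return (
        call[0] == truth[0]
        and abs(call[1] - truth[1]) <= tolerance
        and abs(call[2] - truth[2]) <= tolerance
    )


def precision_sensitivity(
    calls: List[Deletion], truth_set: List[Deletion], tolerance: int = 100
) -> Tuple[float, float]:
    """Compute (precision, sensitivity) of calls against a gold standard.

    Each gold standard deletion can be matched by at most one call.
    """
    matched_truth = set()
    true_positives = 0
    for call in calls:
        for i, truth in enumerate(truth_set):
            if i not in matched_truth and matches(call, truth, tolerance):
                matched_truth.add(i)
                true_positives += 1
                break
    # Precision: fraction of calls that are real deletions.
    precision = true_positives / len(calls) if calls else 0.0
    # Sensitivity: fraction of real deletions that were called.
    sensitivity = true_positives / len(truth_set) if truth_set else 0.0
    return precision, sensitivity


# Hypothetical example data.
truth = [("chr1", 1000, 2000), ("chr1", 5000, 5600), ("chr2", 300, 900)]
calls = [("chr1", 1010, 1990), ("chr2", 10000, 10500)]
precision, sensitivity = precision_sensitivity(calls, truth)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

Precision is only meaningful when the gold standard is complete: if it omitted real deletions, a correct call could be counted as a false positive.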