Comparison of structural variant callers for massive whole-genome sequence data.

Affiliations

Korea Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea.

Aging Convergence Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea.

Publication information

BMC Genomics. 2024 Mar 28;25(1):318. doi: 10.1186/s12864-024-10239-9.

Abstract

BACKGROUND

Detecting structural variations (SVs) at the population level using next-generation sequencing (NGS) requires substantial computational resources and processing time. Here, we compared the performance of 11 SV callers: Delly, Manta, GridSS, Wham, Sniffles, Lumpy, SvABA, Canvas, CNVnator, MELT, and INSurVeyor. These callers have been published recently and are widely used for processing massive whole-genome sequencing datasets. We evaluated the callers in terms of accuracy, the influence of sequencing depth, running time, and memory usage.
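As an illustration of how caller accuracy can be scored against a truth set, the following minimal Python sketch matches called intervals to truth intervals and reports precision, recall, and F1. The 50% reciprocal-overlap criterion, the greedy one-to-one matching, the function names, and the example coordinates are all assumptions made for this sketch; the abstract does not state the matching rule the authors used.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) with half-open coordinates; a hypothetical layout
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def evaluate(calls: List[Interval], truth: List[Interval],
             min_ro: float = 0.5) -> Dict[str, float]:
    """Greedy one-to-one matching of calls to truth intervals.

    min_ro is an assumed threshold, not a value taken from the paper.
    """
    used = set()  # indices of truth intervals that are already matched
    tp = 0
    for call in calls:
        for i, t in enumerate(truth):
            if i not in used and reciprocal_overlap(call, t) >= min_ro:
                used.add(i)
                tp += 1
                break
    fp = len(calls) - tp
    fn = len(truth) - tp
    precision = tp / len(calls) if calls else 0.0
    recall = tp / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up deletion intervals, for illustration only
truth_dels = [("chr1", 1000, 2000), ("chr1", 5000, 6000), ("chr2", 100, 400)]
called_dels = [("chr1", 1100, 2050), ("chr1", 5900, 7000), ("chr2", 100, 400)]
result = evaluate(called_dels, truth_dels)
print(result)
```

In this toy input, the second call overlaps its nearest truth interval by only 100 bp, so it counts as a false positive and the corresponding truth deletion as a false negative.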

RESULTS

Notably, several callers performed better when calling deletions than when calling duplications, inversions, and insertions. Among the SV callers, Manta identified deletions with better performance and more efficient use of computing resources, and both Manta and MELT demonstrated relatively good precision in calling insertions. We confirmed that the copy number variation callers, Canvas and CNVnator, performed better at identifying long duplications because they employ the read-depth approach. Finally, we verified the genotypes inferred by each SV caller using a phased long-read assembly dataset, and Manta showed the highest concordance for deletions and insertions.
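Genotype concordance of the kind mentioned above can be sketched as the fraction of matched SVs whose called genotype agrees with the reference genotype once phasing is ignored. The following Python example is a minimal sketch under stated assumptions: the SV identifiers, the VCF-style genotype strings, and the rule of skipping missing genotypes are hypothetical choices for illustration, not the authors' procedure.

```python
import math
from typing import Dict, Optional, Tuple


def normalize_gt(gt: str) -> Optional[Tuple[int, ...]]:
    """Turn a VCF-style genotype into a sorted allele tuple.

    "0|1" and "1/0" both become (0, 1); a genotype with a missing
    allele (".") returns None.
    """
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return tuple(sorted(int(a) for a in alleles))


def genotype_concordance(called: Dict[str, str],
                         truth: Dict[str, str]) -> float:
    """Fraction of SVs present in both sets whose unphased genotypes agree.

    SVs absent from the truth set or with a missing genotype are skipped
    (an assumption of this sketch).
    """
    compared = 0
    agree = 0
    for sv_id, gt in called.items():
        if sv_id not in truth:
            continue
        c = normalize_gt(gt)
        t = normalize_gt(truth[sv_id])
        if c is None or t is None:
            continue
        compared += 1
        if c == t:
            agree += 1
    return agree / compared if compared else math.nan


# Made-up genotypes keyed by hypothetical SV identifiers
caller_gts = {"DEL1": "0/1", "DEL2": "1/1", "INS1": "0/1", "DEL3": "./."}
assembly_gts = {"DEL1": "1|0", "DEL2": "0|1", "INS1": "0|1", "DEL3": "1|1"}
concordance = genotype_concordance(caller_gts, assembly_gts)
print(f"genotype concordance: {concordance:.3f}")
```

Here DEL3 is excluded because the caller reported no genotype, and DEL2 disagrees (homozygous call versus heterozygous truth), so two of the three compared SVs are concordant.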

CONCLUSIONS

Our findings provide a comprehensive understanding of the accuracy and computational efficiency of SV callers, thereby facilitating integrative analysis of SV profiles in diverse large-scale genomic datasets.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9809/10976732/dc61738eb4bf/12864_2024_10239_Fig1_HTML.jpg
