Comprehensive assessment of long-read sequencing platforms and calling algorithms for detection of copy number variation.

Affiliations

National Genomics Data Center, China National Center for Bioinformation, Beichen West Road, Chaoyang District, Beijing 100101, China.

Beijing Institute of Genomics, Chinese Academy of Sciences, Beichen West Road, Chaoyang District, Beijing 100101, China.

Publication information

Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae441.

Abstract

Copy number variations (CNVs) play pivotal roles in disease susceptibility and have been intensively investigated in human disease studies. Long-read sequencing technologies offer opportunities for comprehensive structural variation (SV) detection, and numerous methodologies have been developed recently. Consequently, there is a pressing need to assess these methods and aid researchers in selecting appropriate techniques for CNV detection using long-read sequencing. Hence, we conducted an evaluation of eight CNV calling methods across 22 datasets from nine publicly available samples and 15 simulated datasets, covering multiple sequencing platforms. The overall performance of CNV callers varied substantially and was influenced by the input dataset type, sequencing depth, and CNV type, among others. Specifically, the PacBio CCS sequencing platform outperformed PacBio CLR and Nanopore platforms regarding CNV detection recall rates. A sequencing depth of 10x demonstrated the capability to identify 85% of the CNVs detected in a 50x dataset. Moreover, deletions were more generally detectable than duplications. Among the eight benchmarked methods, cuteSV, Delly, pbsv, and Sniffles2 demonstrated superior accuracy, while SVIM exhibited high recall rates.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1075/11387058/d75aa1f3eac8/bbae441f1.jpg
