
Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data.

Affiliations

Department of Computer Science, Vanderbilt University, 37235, Nashville, TN, USA.

Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, TN, USA.

Publication information

Nat Commun. 2024 Mar 19;15(1):2447. doi: 10.1038/s41467-024-46614-z.

DOI: 10.1038/s41467-024-46614-z
PMID: 38503752
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10951360/
Abstract

Long-read sequencing offers long contiguous DNA fragments, facilitating diploid genome assembly and structural variant (SV) detection. Efficient and robust algorithms for SV identification are crucial with increasing data availability. Alignment-based methods, favored for their computational efficiency and lower coverage requirements, are prominent. Alternative approaches, relying solely on available reads for de novo genome assembly and employing assembly-based tools for SV detection via comparison to a reference genome, demand significantly more computational resources. However, the lack of comprehensive benchmarking constrains our comprehension and hampers further algorithm development. Here we systematically compare 14 read alignment-based SV calling methods (including 4 deep learning-based methods and 1 hybrid method), and 4 assembly-based SV calling methods, alongside 4 upstream aligners and 7 assemblers. Assembly-based tools excel in detecting large SVs, especially insertions, and exhibit robustness to evaluation parameter changes and coverage fluctuations. Conversely, alignment-based tools demonstrate superior genotyping accuracy at low sequencing coverage (5-10×) and excel in detecting complex SVs, like translocations, inversions, and duplications. Our evaluation provides performance insights, highlighting the absence of a universally superior tool. We furnish guidelines across 31 criteria combinations, aiding users in selecting the most suitable tools for diverse scenarios and offering directions for further method development.
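
The abstract notes that tool rankings depend on evaluation parameters. As a rough illustration of why, the sketch below scores a call set against a truth set using two matching thresholds and reports precision, recall and F1. It is a minimal sketch under stated assumptions, not the evaluation procedure used in the paper: the `SV` record, the `max_dist` and `min_size_ratio` parameters, the greedy matching rule and the example coordinates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a real VCF parser)."""
    chrom: str
    pos: int      # breakpoint start position
    svtype: str   # e.g. "INS" or "DEL"
    length: int   # SV length in bp, assumed positive


def is_match(call, truth, max_dist, min_size_ratio):
    """True when two SVs agree on chromosome, type, position and size."""
    if call.chrom != truth.chrom or call.svtype != truth.svtype:
        return False
    if abs(call.pos - truth.pos) > max_dist:
        return False
    ratio = min(call.length, truth.length) / max(call.length, truth.length)
    return ratio >= min_size_ratio


def benchmark(calls, truths, max_dist=500, min_size_ratio=0.7):
    """Greedy one-to-one matching, then precision, recall and F1."""
    unmatched = list(truths)
    tp = 0
    for call in calls:
        for truth in unmatched:
            if is_match(call, truth, max_dist, min_size_ratio):
                unmatched.remove(truth)  # each truth SV is matched at most once
                tp += 1
                break
    precision = tp / len(calls) if calls else 0.0
    recall = tp / len(truths) if truths else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


# Made-up example data: two truth SVs and three calls.
truths = [SV("chr1", 10_000, "INS", 300), SV("chr1", 50_000, "DEL", 1_000)]
calls = [
    SV("chr1", 10_120, "INS", 280),
    SV("chr1", 50_400, "DEL", 650),
    SV("chr2", 7_000, "DEL", 90),
]

# Re-score the same calls under different evaluation parameters.
for max_dist, ratio in [(500, 0.7), (500, 0.5), (100, 0.7)]:
    p, r, f1 = benchmark(calls, truths, max_dist, ratio)
    print(f"max_dist={max_dist} ratio={ratio}: P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

With this toy data, relaxing the size-similarity threshold from 0.7 to 0.5 admits the second deletion as a true positive, while tightening the breakpoint distance to 100 bp rejects every call. A tool whose breakpoints and lengths are more precise would be less sensitive to such changes, which is the kind of robustness the abstract attributes to assembly-based callers.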


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1827/10951360/b7887a5a7f04/41467_2024_46614_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1827/10951360/ed842b9e06ee/41467_2024_46614_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1827/10951360/2ff95c8de3c8/41467_2024_46614_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1827/10951360/6b7a1d4e8fb5/41467_2024_46614_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1827/10951360/03b8019f23fc/41467_2024_46614_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1827/10951360/92e9e65747df/41467_2024_46614_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1827/10951360/6dfead77e67d/41467_2024_46614_Fig7_HTML.jpg
