Comparison and benchmark of structural variants detected from long read and long-read assembly.

Affiliations

MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.

School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.

Publication information

Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad188.

DOI: 10.1093/bib/bbad188
PMID: 37200087
Abstract

Structural variant (SV) detection is essential for genomic studies, and long-read sequencing technologies have advanced our capacity to detect SVs directly from read or de novo assembly, also known as read-based and assembly-based strategy. However, to date, no independent studies have compared and benchmarked the two strategies. Here, on the basis of SVs detected by 20 read-based and eight assembly-based detection pipelines from six datasets of HG002 genome, we investigated the factors that influence the two strategies and assessed their performance with well-curated SVs. We found that up to 80% of the SVs could be detected by both strategies among different long-read datasets, whereas variant type, size, and breakpoint detected by read-based strategy were greatly affected by aligners. For the high-confident insertions and deletions at non-tandem repeat regions, a remarkable subset of them (82% in assembly-based calls and 93% in read-based calls), accounting for around 4000 SVs, could be captured by both reads and assemblies. However, discordance between two strategies was largely caused by complex SVs and inversions, which resulted from inconsistent alignment of reads and assemblies at these loci. Finally, benchmarking with SVs at medically relevant genes, the recall of read-based strategy reached 77% on 5X coverage data, whereas assembly-based strategy required 20X coverage data to achieve similar performance. Therefore, integrating SVs from read and assembly is suggested for general-purpose detection because of inconsistently detected complex SVs and inversions, whereas assembly-based strategy is optional for applications with limited resources.
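
The concordance and recall figures in the abstract depend on how two SV calls are judged to be the same event. As a rough illustration only (the abstract does not state the matching rule the authors used), the sketch below matches calls by chromosome, type, breakpoint distance, and size ratio. The `SV` record, the function names, the toy callsets, and the 500 bp and 0.7 thresholds are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Minimal, hypothetical SV record used only for this illustration."""
    chrom: str
    pos: int      # start breakpoint (reference coordinate)
    svtype: str   # e.g. "INS", "DEL", "INV"
    size: int     # absolute SV length in bp


def is_match(a: SV, b: SV, max_dist: int = 500, min_size_ratio: float = 0.7) -> bool:
    """Treat two calls as the same event if chromosome and type agree,
    the breakpoints are close, and the sizes are similar.
    Both thresholds are illustrative assumptions, not the paper's settings."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.pos - b.pos) > max_dist:
        return False
    small, large = sorted((a.size, b.size))
    return large > 0 and small / large >= min_size_ratio


def shared_fraction(calls: list[SV], other: list[SV], **kw) -> float:
    """Fraction of `calls` that have at least one matching call in `other`."""
    if not calls:
        return 0.0
    hits = sum(any(is_match(c, o, **kw) for o in other) for c in calls)
    return hits / len(calls)


def recall(truth: list[SV], calls: list[SV], **kw) -> float:
    """Recall against a curated truth set: fraction of truth SVs recovered."""
    return shared_fraction(truth, calls, **kw)


# Toy callsets (made up for the example, not data from the study).
read_calls = [
    SV("chr1", 1000, "DEL", 300),
    SV("chr1", 5000, "INS", 120),
    SV("chr2", 700, "INV", 2000),   # found only by the read-based caller
]
asm_calls = [
    SV("chr1", 1040, "DEL", 310),
    SV("chr1", 5010, "INS", 118),
    SV("chr3", 900, "DEL", 80),     # found only by the assembly-based caller
    SV("chr2", 9000, "DEL", 60),    # found only by the assembly-based caller
]

print(f"shared, relative to read-based calls:     {shared_fraction(read_calls, asm_calls):.2f}")
print(f"shared, relative to assembly-based calls: {shared_fraction(asm_calls, read_calls):.2f}")
```

Each fraction is taken relative to its own callset, so the same shared set gives a different percentage for each strategy. This is how one shared set of around 4000 SVs can correspond to 82% of assembly-based calls and 93% of read-based calls.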

Similar articles

1. Comparison and benchmark of structural variants detected from long read and long-read assembly.
   Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad188.
2. Robust Benchmark Structural Variant Calls of An Asian Using State-of-the-art Long-read Sequencing Technologies.
   Genomics Proteomics Bioinformatics. 2022 Feb;20(1):192-204. doi: 10.1016/j.gpb.2020.10.006. Epub 2021 Mar 2.
3. svclassify: a method to establish benchmark structural variant calls.
   BMC Genomics. 2016 Jan 16;17:64. doi: 10.1186/s12864-016-2366-2.
4. Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data.
   Nat Commun. 2024 Mar 19;15(1):2447. doi: 10.1038/s41467-024-46614-z.
5. Combined use of Oxford Nanopore and Illumina sequencing yields insights into soybean structural variation biology.
   BMC Biol. 2022 Feb 23;20(1):53. doi: 10.1186/s12915-022-01255-w.
6. A Comparison of Structural Variant Calling from Short-Read and Nanopore-Based Whole-Genome Sequencing Using Optical Genome Mapping as a Benchmark.
   Genes (Basel). 2024 Jul 16;15(7):925. doi: 10.3390/genes15070925.
7. Expectations and blind spots for structural variation detection from long-read assemblies and short-read genome sequencing technologies.
   Am J Hum Genet. 2021 May 6;108(5):919-928. doi: 10.1016/j.ajhg.2021.03.014. Epub 2021 Mar 30.
8. Benchmarking Oxford Nanopore read alignment-based insertion and deletion detection in crop plant genomes.
   Plant Genome. 2023 Jun;16(2):e20314. doi: 10.1002/tpg2.20314. Epub 2023 Mar 29.
9. Automated filtering of genome-wide large deletions through an ensemble deep learning framework.
   Methods. 2022 Oct;206:77-86. doi: 10.1016/j.ymeth.2022.08.001. Epub 2022 Aug 28.
10. SVsearcher: A more accurate structural variation detection method in long read data.
    Comput Biol Med. 2023 May;158:106843. doi: 10.1016/j.compbiomed.2023.106843. Epub 2023 Mar 31.

Cited by

1. ASVBM: Structural variant benchmarking with local joint analysis for multiple callsets.
   Comput Struct Biotechnol J. 2025 Jun 29;27:2851-2862. doi: 10.1016/j.csbj.2025.06.045. eCollection 2025.
2. Systematic benchmarking of tools for structural variation detection using short- and long-read sequencing data in pigs.
   iScience. 2025 Feb 8;28(3):111983. doi: 10.1016/j.isci.2025.111983. eCollection 2025 Mar 21.
3. Highly accurate Korean draft genomes reveal structural variation highlighting human telomere evolution.
   Nucleic Acids Res. 2025 Jan 7;53(1). doi: 10.1093/nar/gkae1294.
4. A Graph-based Goat Pangenome Reveals Structural Variations Involved in Domestication and Adaptation.
   Mol Biol Evol. 2024 Dec 6;41(12). doi: 10.1093/molbev/msae251.
5. Identification of osteoporosis genes using family studies.
   Front Endocrinol (Lausanne). 2024 Oct 22;15:1455689. doi: 10.3389/fendo.2024.1455689. eCollection 2024.