SvABA: genome-wide detection of structural variants and indels by local assembly.

Affiliations

The Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA.

Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA.

Publication information

Genome Res. 2018 Apr;28(4):581-591. doi: 10.1101/gr.221028.117. Epub 2018 Mar 13.

Abstract

Structural variants (SVs), including small insertion and deletion variants (indels), are challenging to detect through standard alignment-based variant calling methods. Sequence assembly offers a powerful approach to identifying SVs, but is difficult to apply at scale genome-wide for SV detection due to its computational complexity and the difficulty of extracting SVs from assembly contigs. We describe SvABA, an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements. We evaluated SvABA's performance on the NA12878 human genome and in simulated and real cancer genomes. SvABA demonstrates superior sensitivity and specificity across a large spectrum of SVs and substantially improves detection performance for variants in the 20-300 bp range, compared with existing methods. SvABA also identifies complex somatic rearrangements with chains of short (<1000 bp) templated-sequence insertions copied from distant genomic regions. We applied SvABA to 344 cancer genomes from 11 cancer types and found that short templated-sequence insertions occur in ∼4% of all somatic rearrangements. Finally, we demonstrate that SvABA can identify sites of viral integration and cancer driver alterations containing medium-sized (50-300 bp) SVs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca44/5880247/8e1da77661bd/581_F1.jpg
