SVseq: an approach for detecting exact breakpoints of deletions with low-coverage sequence data.

Affiliation

Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA.

Publication information

Bioinformatics. 2011 Dec 1;27(23):3228-34. doi: 10.1093/bioinformatics/btr563. Epub 2011 Oct 12.

DOI:10.1093/bioinformatics/btr563

PMID:21994222

Abstract

MOTIVATION

Structural variation (SV), such as deletion, is an important type of genetic variation and may be associated with diseases. While there are many existing methods for detecting SVs, finding deletions is still challenging with low-coverage short sequence reads. Existing deletion finding methods for sequence reads either use the so-called split reads mapping for detecting deletions with exact breakpoints, or rely on discordant insert sizes to estimate approximate positions of deletions. Neither is completely satisfactory with low-coverage sequence reads.

RESULTS

We present SVseq, an efficient two-stage approach that combines split reads mapping with discordant insert size analysis. The first stage is split reads mapping based on the Burrows-Wheeler transform (BWT), which finds candidate deletions. Our split reads mapping method allows mismatches and small indels, so deletions near other small variations can be discovered and reads with sequencing errors can still be used. The second stage filters out false positives by analyzing discordant insert sizes. On both simulated and empirical data, SVseq is more accurate than an alternative approach and is also much faster.
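The two-stage idea can be illustrated with a toy Python sketch. This is not the SVseq implementation: it replaces the BWT index with plain exact string search, allows no mismatches or indels, and every name in it (`Deletion`, `split_read_candidate`, `supported_by_discordant_pairs`, and the parameters `min_anchor`, `min_pairs`, `z`) is hypothetical. The consistency rule in the second stage is likewise an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Deletion:
    start: int  # 0-based index of the first deleted reference base
    end: int    # 0-based index one past the last deleted base


def split_read_candidate(reference: str, read: str,
                         min_anchor: int = 8) -> Optional[Deletion]:
    """Stage 1 (toy): split the read into a prefix and a suffix and look
    for both on the reference with a gap between them.

    Exact matching via str.find stands in for BWT-based mapping here.
    """
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:cut], read[cut:]
        left = reference.find(prefix)
        if left == -1:
            continue
        # The suffix must map downstream, leaving at least one deleted base.
        right = reference.find(suffix, left + cut + 1)
        if right != -1:
            return Deletion(left + cut, right)
    return None


def supported_by_discordant_pairs(deletion: Deletion,
                                  pairs: List[Tuple[int, int]],
                                  mean_insert: float,
                                  sd_insert: float,
                                  min_pairs: int = 2,
                                  z: float = 3.0) -> bool:
    """Stage 2 (toy): keep a candidate only if enough read pairs span it
    with an insert size that is too large by roughly the deletion length.

    Each pair is given as (outer_start, outer_end) on the reference,
    a simplification that ignores the individual mate coordinates.
    """
    tolerance = z * sd_insert
    deleted = deletion.end - deletion.start
    support = 0
    for outer_start, outer_end in pairs:
        span = outer_end - outer_start
        spans_deletion = outer_start < deletion.start and outer_end > deletion.end
        discordant = span > mean_insert + tolerance
        consistent = abs(span - mean_insert - deleted) <= tolerance
        if spans_deletion and discordant and consistent:
            support += 1
    return support >= min_pairs
```

In this sketch the first function reports exact breakpoints from a single read, while the second only confirms that the paired-end evidence agrees with them. That mirrors the division of labor described above, where split reads locate candidates and insert sizes remove false positives.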

AVAILABILITY

The program SVseq can be downloaded at http://www.engr.uconn.edu/~jiz08001/

CONTACT

jinzhang@engr.uconn.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data.

BMC Bioinformatics. 2012 Apr 19;13 Suppl 6(Suppl 6):S6. doi: 10.1186/1471-2105-13-S6-S6.

Sprites: detection of deletions from sequencing data by re-aligning split reads.

Bioinformatics. 2016 Jun 15;32(12):1788-96. doi: 10.1093/bioinformatics/btw053. Epub 2016 Feb 1.

PRISM: pair-read informed split-read mapping for base-pair level detection of insertion, deletion and structural variants.

Bioinformatics. 2012 Oct 15;28(20):2576-83. doi: 10.1093/bioinformatics/bts484. Epub 2012 Jul 31.

Robust and exact structural variation detection with paired-end and soft-clipped alignments: SoftSV compared with eight algorithms.

Brief Bioinform. 2016 Jan;17(1):51-62. doi: 10.1093/bib/bbv028. Epub 2015 May 20.

Detecting genomic indel variants with exact breakpoints in single- and paired-end sequencing data using SplazerS.

Bioinformatics. 2012 Mar 1;28(5):619-27. doi: 10.1093/bioinformatics/bts019. Epub 2012 Jan 11.

Detecting exact breakpoints of deletions with diversity in hepatitis B viral genomic DNA from next-generation sequencing data.

Methods. 2017 Oct 1;129:24-32. doi: 10.1016/j.ymeth.2017.08.005. Epub 2017 Aug 10.

Identification of genomic indels and structural variations using split reads.

BMC Genomics. 2011 Jul 25;12:375. doi: 10.1186/1471-2164-12-375.

Seeksv: an accurate tool for somatic structural variation and virus integration detection.

Bioinformatics. 2017 Jan 15;33(2):184-191. doi: 10.1093/bioinformatics/btw591. Epub 2016 Sep 14.

The challenge of detecting indels in bacterial genomes from short-read sequencing data.

J Biotechnol. 2017 May 20;250:11-15. doi: 10.1016/j.jbiotec.2017.02.026. Epub 2017 Mar 4.

Cited by

SVseq discloses the genomic complexity of different prenatal, de novo, apparently balanced chromosome rearrangements detected by CMA and karyotype.

Ital J Pediatr. 2025 May 24;51(1):155. doi: 10.1186/s13052-025-01987-9.

High-resolution mapping reveals the mechanism and contribution of genome insertions and deletions to RNA virus evolution.

Proc Natl Acad Sci U S A. 2023 Aug;120(31):e2304667120. doi: 10.1073/pnas.2304667120. Epub 2023 Jul 24.

Efficient collection of a large number of mutations by mutagenesis of DNA damage response defective animals.

Sci Rep. 2021 Apr 7;11(1):7630. doi: 10.1038/s41598-021-87226-7.

Somatic structural variation targets neurodevelopmental genes and identifies SHANK2 as a tumor suppressor in neuroblastoma.

Genome Res. 2020 Sep;30(9):1228-1242. doi: 10.1101/gr.252106.119. Epub 2020 Aug 13.

Detection of Genomic Structural Variants from Next-Generation Sequencing Data.

Front Bioeng Biotechnol. 2015 Jun 25;3:92. doi: 10.3389/fbioe.2015.00092. eCollection 2015.

Structural variation discovery in the cancer genome using next generation sequencing: computational solutions and perspectives.

Oncotarget. 2015 Mar 20;6(8):5477-89. doi: 10.18632/oncotarget.3491.

Identification of copy number variants in whole-genome data using Reference Coverage Profiles.

Front Genet. 2015 Feb 17;6:45. doi: 10.3389/fgene.2015.00045. eCollection 2015.

GINDEL: accurate genotype calling of insertions and deletions from low coverage population sequence reads.

PLoS One. 2014 Nov 25;9(11):e113324. doi: 10.1371/journal.pone.0113324. eCollection 2014.

Vindel: a simple pipeline for checking indel redundancy.

BMC Bioinformatics. 2014 Nov 19;15(1):359. doi: 10.1186/s12859-014-0359-1.

Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives.

BMC Bioinformatics. 2013;14 Suppl 11(Suppl 11):S1. doi: 10.1186/1471-2105-14-S11-S1. Epub 2013 Sep 13.
