SVseq: an approach for detecting exact breakpoints of deletions with low-coverage sequence data.

Affiliation

Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA.

Publication information

Bioinformatics. 2011 Dec 1;27(23):3228-34. doi: 10.1093/bioinformatics/btr563. Epub 2011 Oct 12.

Abstract

MOTIVATION

Structural variation (SV), such as deletion, is an important type of genetic variation and may be associated with disease. Although many methods exist for detecting SVs, finding deletions remains challenging with low-coverage short sequence reads. Existing deletion-finding methods for sequence reads either use so-called split-read mapping to detect deletions with exact breakpoints, or rely on discordant insert sizes to estimate the approximate positions of deletions. Neither is completely satisfactory with low-coverage sequence reads.

RESULTS

We present SVseq, an efficient two-stage approach that combines split-read mapping with discordant insert size analysis. The first stage is split-read mapping based on the Burrows-Wheeler transform (BWT), which finds candidate deletions. Our split-read mapping method allows mismatches and small indels, so deletions near other small variations can be discovered and reads with sequencing errors can still be used. The second stage filters out false positives by analyzing discordant insert sizes. When applied to simulated and empirical data, SVseq is more accurate than an alternative approach and is also much faster.
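
The two-stage idea can be illustrated with a toy Python sketch. This is not the SVseq implementation: plain exact substring search stands in for the BWT index, mismatches and small indels are not handled, and every name and threshold here (`Deletion`, `split_read_candidate`, `supported_by_pairs`, `min_anchor`, `z`, `min_pairs`) is a hypothetical choice made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Deletion:
    start: int  # 0-based index of the first deleted reference base
    end: int    # 0-based index one past the last deleted base


def split_read_candidate(reference: str, read: str,
                         min_anchor: int = 8) -> Optional[Deletion]:
    """Stage 1 (toy): split the read and map both parts by exact search.

    Assumption: str.find replaces the BWT-based mapping, so only the first
    exact occurrence of each part is considered.
    """
    if read in reference:
        return None  # the read maps contiguously, no split is needed
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:cut], read[cut:]
        left = reference.find(prefix)
        if left == -1:
            continue
        prefix_end = left + len(prefix)
        # The suffix must map downstream, leaving a gap of at least one base.
        right = reference.find(suffix, prefix_end + 1)
        if right != -1:
            return Deletion(prefix_end, right)
    return None


def supported_by_pairs(deletion: Deletion,
                       pairs: List[Tuple[int, int]],
                       mean_insert: float,
                       sd_insert: float,
                       z: float = 3.0,
                       min_pairs: int = 1) -> bool:
    """Stage 2 (toy): keep a candidate only if discordant pairs span it.

    Each pair is (leftmost mapped position, rightmost mapped position).
    A pair counts as discordant when its mapped span exceeds
    mean_insert + z * sd_insert (an assumed cutoff).
    """
    threshold = mean_insert + z * sd_insert
    support = 0
    for left_start, right_end in pairs:
        spans = left_start < deletion.start and right_end > deletion.end
        discordant = (right_end - left_start) > threshold
        if spans and discordant:
            support += 1
    return support >= min_pairs


if __name__ == "__main__":
    # Made-up reference: left flank + 10 deleted bases + right flank.
    ref = "ACGTTGCAGGATCCAT" + "ACACACACAC" + "CAGTCAGAGATTACAG"
    read = "GGATCCAT" + "CAGTCAGA"  # read that crosses the deletion
    print(split_read_candidate(ref, read))
```

In this sketch the split-read stage yields exact breakpoints, while the insert size stage only confirms or rejects a candidate, which mirrors the division of labor described above. A practical tool would also need to deal with repeats, ambiguous breakpoints, and inexact matches.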

AVAILABILITY

The program SVseq can be downloaded at http://www.engr.uconn.edu/~jiz08001/

CONTACT

jinzhang@engr.uconn.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
