Sprites: detection of deletions from sequencing data by re-aligning split reads.

Authors

Zhang Zhen, Wang Jianxin, Luo Junwei, Ding Xiaojun, Zhong Jiancheng, Wang Jun, Wu Fang-Xiang, Pan Yi

Affiliations

School of Information Science and Engineering, Central South University, Changsha, 410083, China; College of Information and Communication Engineering, Hunan Institute of Science and Technology, Yueyang, 414006, China.

School of Information Science and Engineering, Central South University, Changsha, 410083, China.

Publication information

Bioinformatics. 2016 Jun 15;32(12):1788-96. doi: 10.1093/bioinformatics/btw053. Epub 2016 Feb 1.

Abstract

MOTIVATION

Advances in next-generation sequencing technologies and the availability of short-read data enable the detection of structural variations (SVs). Deletions, an important type of SV, have been suggested to be associated with genetic diseases. There are three types of deletions: blunt deletions, deletions with microhomologies and deletions with microinsertions. The last two types are very common in the human genome, but they are difficult to detect. Furthermore, finding deletions from sequencing data remains challenging. Sensitive and accurate methods for detecting deletions from sequencing data, especially deletions with microhomologies and deletions with microinsertions, are therefore highly desirable.

RESULTS

We present a novel method called Sprites (SPlit Read re-alIgnment To dEtect Structural variants), which finds deletions from sequencing data. It aligns a whole soft-clipped read, rather than only its clipped part, to the target sequence, a segment of the reference that is determined by spanning reads, in order to find the longest prefix or suffix of the read that has a match in the target sequence. This alignment is designed to handle deletions with microhomologies and deletions with microinsertions. Using both simulated and real data, we show that Sprites outperforms other current methods at detecting deletions in terms of F-score.
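The core idea of matching the longest prefix or suffix of a whole read against a target sequence can be illustrated with a minimal sketch. This is not the Sprites implementation: the function names, the toy sequences and the use of exact string matching are all assumptions made for illustration, and the abstract does not say how the actual re-alignment scores matches or handles sequencing errors.

```python
from typing import Tuple


def longest_prefix_match(read: str, target: str) -> Tuple[int, int]:
    """Return (length, position) of the longest prefix of `read` that occurs
    exactly in `target`; (0, -1) if not even the first base occurs.
    Hypothetical helper: exact matching only, no mismatches or gaps."""
    for k in range(len(read), 0, -1):      # try the longest prefix first
        pos = target.find(read[:k])        # first exact occurrence, or -1
        if pos >= 0:
            return k, pos
    return 0, -1


def longest_suffix_match(read: str, target: str) -> Tuple[int, int]:
    """Same as above, but for the longest suffix of `read`."""
    for k in range(len(read), 0, -1):      # try the longest suffix first
        pos = target.find(read[len(read) - k:])
        if pos >= 0:
            return k, pos
    return 0, -1


# Toy reference (assumed data): left flank + deleted segment + right flank.
# The deleted segment and the right flank both start with "AC",
# i.e. a 2-bp microhomology at the deletion junction.
left, deleted, right = "GATTACAG", "ACTTCCCC", "ACGTGCAT"
reference = left + deleted + right

# A read from the sample spanning the junction: end of left + start of right.
read = left[-5:] + right[:6]               # "TACAGACGTGC"

# Assumed target segments upstream and downstream of the junction.
upstream_target = reference[:12]           # "GATTACAGACTT"
downstream_target = reference[12:]         # "CCCCACGTGCAT"

prefix_len, prefix_pos = longest_prefix_match(read, upstream_target)
suffix_len, suffix_pos = longest_suffix_match(read, downstream_target)
print(prefix_len, prefix_pos, suffix_len, suffix_pos)
```

In this toy case the matched prefix (7 bases) and matched suffix (6 bases) together exceed the read length (11 bases); the 2-base overlap corresponds to the "AC" microhomology built into the example. Matching only a clipped part of fixed length would not expose that overlap, which is one way to see why aligning the whole read can help at such junctions.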

AVAILABILITY AND IMPLEMENTATION

Sprites is open source software and freely available at https://github.com/zhangzhen/sprites

CONTACT

jxwang@mail.csu.edu.cn

SUPPLEMENTARY DATA

Supplementary data are available at Bioinformatics online.
