GASSST: global alignment short sequence search tool.

Author information

Univ-Rennes 1/IRISA, IRISA - Symbiose Campus universitaire de Beaulieu, 35042 Rennes Cedex, France.

Publication information

Bioinformatics. 2010 Oct 15;26(20):2534-40. doi: 10.1093/bioinformatics/btq485. Epub 2010 Aug 24.

Abstract

MOTIVATION

The rapid development of next-generation sequencing technologies, which can produce huge amounts of sequence data, is leading to a wide range of new applications. This creates a need for fast and accurate alignment software. Common techniques often restrict indels in the alignment to improve speed, whereas more flexible aligners are too slow for large-scale applications. Moreover, many current aligners are becoming inefficient as generated reads grow ever longer. Our goal with our new aligner GASSST (Global Alignment Short Sequence Search Tool) is thus twofold: achieving high performance with no restrictions on the number of indels, with a design that is still effective on long reads.

RESULTS

We propose a new, efficient filtering step that discards most alignments coming from the seed phase before they are checked by the costly dynamic programming algorithm. We use a carefully designed series of filters of increasing complexity and efficiency to quickly eliminate most candidate alignments in a wide range of configurations. The main filter uses a precomputed table containing the alignment scores of short four-base words aligned against each other. This table is reused several times by a new algorithm designed to approximate the score of the full dynamic programming algorithm. We compare the performance of GASSST against BWA, BFAST, SSAHA2 and PASS. We found that GASSST achieves high sensitivity in a wide range of configurations and a faster overall execution time than other state-of-the-art aligners.
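To make the idea of a precomputed four-base score table more concrete, here is a minimal Python sketch. It is not GASSST's filter: the abstract does not give the scoring scheme or say how the table lookups are combined, so everything below is an assumption made for illustration. The sketch assumes unit-cost edit distance as the score, builds a table for all pairs of four-base words, and sums one lookup per four-base block of a read and an equal-length reference window. The names `edit_distance`, `TABLE` and `blockwise_estimate` are hypothetical.

```python
from itertools import product

WORD = 4          # word length, matching the four-base words in the abstract
BASES = "ACGT"


def edit_distance(a, b):
    """Unit-cost edit distance by plain dynamic programming (assumed scoring)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # gap in b
                           cur[j - 1] + 1,             # gap in a
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


# Precompute the score of every four-base word against every other one.
# 256 x 256 = 65,536 entries, built once and reused for every candidate.
WORDS = ["".join(p) for p in product(BASES, repeat=WORD)]
TABLE = {(u, v): edit_distance(u, v) for u in WORDS for v in WORDS}


def blockwise_estimate(read, ref):
    """Sum table lookups over consecutive four-base blocks.

    Both sequences must have the same length, a multiple of WORD.
    The result is an upper bound on the true edit distance, because the
    per-block alignments can be concatenated into one full alignment.
    """
    if len(read) != len(ref) or len(read) % WORD:
        raise ValueError("sequences must have equal length, a multiple of 4")
    return sum(TABLE[read[i:i + WORD], ref[i:i + WORD]]
               for i in range(0, len(read), WORD))
```

Because the block sum is an upper bound, this sketch can only confirm cheaply that a candidate is within an error budget; a filter that safely discards candidates, as the abstract describes, needs a lower bound on the alignment cost, which this sketch does not attempt. It does show why a word table is attractive: a shifted word pair such as `ACGT` against `CGTA` costs 2 in the table (one deletion plus one insertion), whereas a per-base mismatch count would charge 4.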

AVAILABILITY

GASSST is distributed under the CeCILL software license at http://www.irisa.fr/symbiose/projects/gassst/

CONTACT

guillaume.rizk@irisa.fr; dominique.lavenier@irisa.fr

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


