SSW library: an SIMD Smith-Waterman C/C++ library for use in genomic applications.

Affiliation

Department of Biology, Boston College, Chestnut Hill, Massachusetts, United States of America.

Publication information

PLoS One. 2013 Dec 4;8(12):e82138. doi: 10.1371/journal.pone.0082138. eCollection 2013.

Abstract

BACKGROUND

The Smith-Waterman algorithm, which produces the optimal pairwise local alignment between two sequences, is frequently used as a key component of fast heuristic read mapping and variation detection tools for next-generation sequencing data. Although various fast Smith-Waterman implementations have been developed, they are either designed as monolithic protein database search tools, which do not return detailed alignments, or are embedded in other tools. These issues make it impractical to reuse these efficient Smith-Waterman implementations.
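As a point of reference for what a "detailed alignment" means here, the following is a minimal scalar sketch of Smith-Waterman with traceback. It is not the SIMD implementation discussed in this abstract: the function name `smith_waterman`, the scoring values, and the use of a single linear gap penalty (rather than separate gap-open and gap-extend penalties) are all assumptions made for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-2, gap=-3):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    Illustrative only: full O(len(a) * len(b)) matrix, linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; a cell never drops below 0 (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # match or mismatch
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("GGACGTCC", "TTACGTAA")
print(score, top, bottom)
```

A score-only search tool would stop after the matrix fill and report only `best`; the traceback loop is the part that yields the alignment itself.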

RESULTS

To facilitate integration of the fast Single-Instruction-Multiple-Data (SIMD) Smith-Waterman algorithm into third-party software, we wrote a C/C++ library that extends Farrar's Striped Smith-Waterman (SSW) to return alignment information in addition to the optimal Smith-Waterman score. In this library we developed a new method that generates the full optimal alignment and a suboptimal score in linear space at little cost in efficiency. This improvement makes fast SIMD Smith-Waterman practical for genomic applications. SSW is available both as a C/C++ software library and as a stand-alone alignment tool at: https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library.
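To make the phrase "in linear space" concrete, here is a hedged sketch of one common way to recover alignment coordinates without storing the full matrix: a forward score-only pass finds where the best alignment ends, and a second pass over the reversed prefixes finds where it begins. This is a scalar illustration under assumed scoring values and a linear gap penalty; the names `sw_end` and `sw_coordinates` are hypothetical, and the sketch is not claimed to be the library's exact method or API.

```python
def sw_end(a, b, match=2, mismatch=-2, gap=-3):
    """Score-only Smith-Waterman that keeps a single previous row.

    Returns (best_score, (end_a, end_b)) with 0-based inclusive end indices,
    or (0, (-1, -1)) if no positive-scoring alignment exists.
    """
    prev = [0] * (len(b) + 1)
    best, end = 0, (-1, -1)
    for i, ca in enumerate(a):
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b):
            s = match if ca == cb else mismatch
            cur[j + 1] = max(0, prev[j] + s, prev[j + 1] + gap, cur[j] + gap)
            if cur[j + 1] > best:  # strict '>' keeps the first best cell
                best, end = cur[j + 1], (i, j)
        prev = cur
    return best, end


def sw_coordinates(a, b, **scoring):
    """Return (score, (start_a, end_a), (start_b, end_b)) using O(len(b)) memory."""
    score, (end_a, end_b) = sw_end(a, b, **scoring)
    if score == 0:
        return 0, None, None
    # Reverse the prefixes that end at the best cell; the end of the best
    # alignment in the reversed strings marks the start in the originals.
    rev_a = a[:end_a + 1][::-1]
    rev_b = b[:end_b + 1][::-1]
    _, (rev_end_a, rev_end_b) = sw_end(rev_a, rev_b, **scoring)
    return score, (end_a - rev_end_a, end_a), (end_b - rev_end_b, end_b)


print(sw_coordinates("GGACGTCC", "TACGTAA"))
```

Once the begin and end coordinates are known, a full traceback only needs to run on the small sub-rectangle they enclose, which is why this style of approach adds little cost over a score-only search.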

CONCLUSIONS

The SSW library has been used in the primary read mapping tool MOSAIK, the split-read mapping program SCISSORS, the MEI detector TANGRAM, and the read-overlap graph generation program RZMBLR. Replacing the ordinary or banded Smith-Waterman modules in these tools with the SSW library significantly improved their speed.
