SeqAn: an efficient, generic C++ library for sequence analysis.

Authors

Döring Andreas, Weese David, Rausch Tobias, Reinert Knut

Affiliation

Algorithmische Bioinformatik, Institut für Informatik, Takustr. 9, 14195 Berlin, Germany.

Publication

BMC Bioinformatics. 2008 Jan 9;9:11. doi: 10.1186/1471-2105-9-11.

DOI:10.1186/1471-2105-9-11

PMID:18184432

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2246154/

Abstract

BACKGROUND

The use of novel algorithmic techniques is pivotal to many important problems in life science. For example, the sequencing of the human genome [1] would not have been possible without advanced assembly algorithms. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there is a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use.

RESULTS

To remedy this trend we propose the use of SeqAn, a library of efficient data types and algorithms for sequence analysis in computational biology. SeqAn comprises implementations of existing, practical state-of-the-art algorithmic components to provide a sound basis for algorithm testing and development. In this paper we describe the design and content of SeqAn and demonstrate its use by giving two examples. In the first example we show an application of SeqAn as an experimental platform by comparing different exact string matching algorithms. The second example is a simple version of the well-known MUMmer tool rewritten in SeqAn. Results indicate that our implementation is very efficient and versatile to use.
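To make the first example more concrete, the sketch below shows the kind of comparison an experimental platform for exact string matching enables: two interchangeable search routines that must report the same hit positions. It is written against the C++ standard library only and does not use the SeqAn API; the function names `findNaive` and `findHorspool` are hypothetical, and Horspool is chosen here merely as one well-known exact matching algorithm, not as a statement of which algorithms the paper compared.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not SeqAn API): naive reference search that
// returns every start position at which needle occurs in haystack.
std::vector<std::size_t> findNaive(const std::string& haystack,
                                   const std::string& needle) {
    std::vector<std::size_t> hits;
    if (needle.empty() || needle.size() > haystack.size()) return hits;
    for (std::size_t i = 0; i + needle.size() <= haystack.size(); ++i) {
        if (haystack.compare(i, needle.size(), needle) == 0) hits.push_back(i);
    }
    return hits;
}

// Hypothetical helper (not SeqAn API): Horspool search. After each
// window comparison, the window is shifted according to the text
// character aligned with the last position of the needle.
std::vector<std::size_t> findHorspool(const std::string& haystack,
                                      const std::string& needle) {
    std::vector<std::size_t> hits;
    const std::size_t m = needle.size();
    const std::size_t n = haystack.size();
    if (m == 0 || m > n) return hits;

    // Default shift is the full needle length; characters that occur in
    // the needle (except at its last position) allow a smaller shift.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    for (std::size_t i = 0; i + 1 < m; ++i) {
        shift[static_cast<unsigned char>(needle[i])] = m - 1 - i;
    }

    std::size_t pos = 0;
    while (pos + m <= n) {
        if (haystack.compare(pos, m, needle) == 0) hits.push_back(pos);
        pos += shift[static_cast<unsigned char>(haystack[pos + m - 1])];
    }
    return hits;
}
```

Because both routines share one signature, a benchmark harness can time each on the same input (for instance with `std::chrono::steady_clock`) and verify that the returned position lists agree before comparing running times.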

CONCLUSION

We anticipate that SeqAn greatly simplifies the rapid development of new bioinformatics tools by providing a collection of readily usable, well-designed algorithmic components which are fundamental for the field of sequence analysis. This leverages not only the implementation of new algorithms, but also enables a sound analysis and comparison of existing algorithms.
