Institute of Applied Computer Science, Lodz University of Technology, Łódź, Poland.
Faculty of Information Technology, Czech Technical University in Prague, Czechia.
Bioinformatics. 2018 Dec 15;34(24):4290-4292. doi: 10.1093/bioinformatics/bty506.
The many thousands of high-quality genomes now available imply a shift from single-genome to pan-genomic analyses. A basic algorithmic building block for such a scenario is online search over a collection of similar texts, a problem with surprisingly few solutions presented so far.
We present SOPanG, a simple tool for exact pattern matching over an elastic-degenerate string, a recently proposed simplified model for the pan-genome. Thanks to bit-parallelism, it achieves pattern matching speeds above 400 MB/s, more than an order of magnitude higher than that of other software.
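To illustrate the idea of bit-parallel matching over an elastic-degenerate string, the following is a minimal sketch, not SOPanG's actual implementation. It assumes the text is given as a list of segments, each segment being a list of alternative variants (the empty string is allowed), and it uses the classic Shift-And automaton, carrying the OR of the per-variant states across each segment boundary. The names `build_masks` and `ed_shift_and` are hypothetical and chosen only for this example.

```python
def build_masks(pattern):
    """Map each pattern character to a bitmask of the positions where it occurs."""
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    return masks


def ed_shift_and(segments, pattern):
    """Return indices of segments in which some occurrence of `pattern` ends.

    `segments` is a list of lists of strings; each inner list holds the
    alternative variants of one segment (assumed input layout).
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    masks = build_masks(pattern)
    accept = 1 << (m - 1)  # bit set when the whole pattern has been matched

    state = 0  # Shift-And state at the boundary before the current segment
    hits = []
    for idx, variants in enumerate(segments):
        next_state = 0
        found = False
        for variant in variants:
            d = state  # every variant starts from the same incoming state
            for ch in variant:
                # Standard Shift-And step: extend matched prefixes by one
                # character and allow a new match to start at this position.
                d = ((d << 1) | 1) & masks.get(ch, 0)
                if d & accept:
                    found = True
            # An empty variant leaves d == state, passing prefixes through.
            next_state |= d
        state = next_state
        if found:
            hits.append(idx)
    return hits


# Segment 1 offers "G", "T", or nothing between "AC" and the last segment.
example = [["AC"], ["G", "T", ""], ["CA", "A"]]
print(ed_shift_and(example, "ACGA"))  # path AC + G + A
print(ed_shift_and(example, "ACCA"))  # path AC + (empty) + CA
print(ed_shift_and(example, "GT"))    # G and T are alternatives, never adjacent
```

In this sketch the pattern length is not limited to a machine word because Python integers are unbounded; a word-based implementation would typically restrict the pattern to the word size or use multiple words.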
SOPanG is available for free from: https://github.com/MrAlexSee/sopang.
Supplementary data are available at Bioinformatics online.