Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan.
Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Koto-ku, Tokyo 135-0064, Japan.
Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae517.
Spaln is the earliest practical tool for self-sufficient genome mapping and spliced alignment of protein query sequences onto a mammalian-sized eukaryotic genomic sequence. However, its computational speed has become inadequate for the analysis of rapidly growing genomic and transcript sequence data.
The dynamic programming calculation of Spaln has been sped up in two ways: (i) the introduction of the multi-intermediate unidirectional Hirschberg method and (ii) SIMD-based vectorization. The new version, Spaln3, is ∼7 times faster than the latest Spaln version 2, and its gene prediction accuracy is consistently higher than that of Miniprot.
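For orientation, the classic Hirschberg divide-and-conquer idea that (i) builds on can be sketched as follows. This is a minimal illustration of the textbook bidirectional method with a linear gap penalty, not Spaln3's multi-intermediate unidirectional variant. It does not model affine gaps, splicing, or SIMD. The scoring constants and function names are assumptions chosen for the example.

```python
# Illustrative scoring scheme (assumed values, not Spaln's parameters).
MATCH, MISMATCH, GAP = 1, -1, -1


def nw_last_row(a, b):
    """Last row of the global alignment score matrix, using O(len(b)) memory."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * GAP] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            cur[j] = max(prev[j - 1] + sub, prev[j] + GAP, cur[j - 1] + GAP)
        prev = cur
    return prev


def hirschberg(a, b):
    """Return an optimal global alignment of a and b as two gapped strings."""
    if not a:
        return "-" * len(b), b
    if not b:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single residue with its best column in b,
        # unless leaving it unpaired (two extra gaps) scores higher.
        best_j = max(range(len(b)), key=lambda j: MATCH if a == b[j] else MISMATCH)
        pair = MATCH if a == b[best_j] else MISMATCH
        if pair >= 2 * GAP:
            return "-" * best_j + a + "-" * (len(b) - best_j - 1), b
        return a + "-" * len(b), "-" + b

    mid = len(a) // 2
    # Forward scores for the top half, backward scores for the bottom half.
    left = nw_last_row(a[:mid], b)
    right = nw_last_row(a[mid:][::-1], b[::-1])
    # The optimal path crosses row `mid` where the two scores sum to a maximum.
    split = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])

    top_a, top_b = hirschberg(a[:mid], b[:split])
    bot_a, bot_b = hirschberg(a[mid:], b[split:])
    return top_a + bot_a, top_b + bot_b


def alignment_score(x, y):
    """Score a gapped alignment under the same scheme."""
    total = 0
    for p, q in zip(x, y):
        if p == "-" or q == "-":
            total += GAP
        else:
            total += MATCH if p == q else MISMATCH
    return total


if __name__ == "__main__":
    x, y = hirschberg("AGTACGCA", "TATGC")
    print(x)
    print(y)
    print(alignment_score(x, y))
```

Only two score rows are held in memory at any time, so working space grows linearly with sequence length rather than quadratically. This is the property that makes the approach attractive for genome-scale alignment.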