Spaln3: improvement in speed and accuracy of genome mapping and spliced alignment of protein query sequences.

Affiliations

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8562, Japan.

Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Koto-ku, Tokyo 135-0064, Japan.

Publication information

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae517.

Abstract

MOTIVATION

Spaln is the earliest practical tool for self-sufficient genome mapping and spliced alignment of protein query sequences onto a mammalian-sized eukaryotic genomic sequence. However, its computational speed has become inadequate for the analysis of rapidly growing genomic and transcript sequence data.

RESULTS

The dynamic programming calculation of Spaln has been sped up in two ways: (i) the introduction of the multi-intermediate unidirectional Hirschberg method and (ii) SIMD-based vectorization. The new version, Spaln3, is ∼7 times faster than the latest Spaln version 2, and its gene prediction accuracy is consistently higher than that of Miniprot.
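For readers unfamiliar with the Hirschberg idea named above, the following is a minimal Python sketch of the classic divide-and-conquer Hirschberg algorithm for global alignment in linear space. It is not Spaln3's multi-intermediate unidirectional variant, and it leaves out affine gaps, splice-site modelling, and SIMD vectorization. The function names and the scoring constants (MATCH, MISMATCH, GAP) are assumptions chosen only for illustration.

```python
# Illustrative scoring scheme (assumed; not Spaln3's parameters).
MATCH, MISMATCH, GAP = 1, -1, -1


def nw_last_row(a, b):
    """Last row of the Needleman-Wunsch score matrix, using O(len(b)) memory."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * GAP] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            cur[j] = max(prev[j - 1] + sub, prev[j] + GAP, cur[j - 1] + GAP)
        prev = cur
    return prev


def hirschberg(a, b):
    """Return one optimal global alignment of a and b as two gapped strings."""
    if not a:
        return "-" * len(b), b
    if not b:
        return a, "-" * len(a)
    if len(a) == 1:
        # Pair the single residue with a matching position if one exists,
        # otherwise with position 0. Optimal here because MISMATCH >= 2 * GAP.
        j = max(b.find(a), 0)
        return "-" * j + a + "-" * (len(b) - j - 1), b
    mid = len(a) // 2
    # Score the upper half forwards and the lower half backwards.
    left = nw_last_row(a[:mid], b)
    right = nw_last_row(a[mid:][::-1], b[::-1])
    n = len(b)
    # The optimal path crosses row `mid` where the two partial scores sum highest.
    split = max(range(n + 1), key=lambda j: left[j] + right[n - j])
    a1, b1 = hirschberg(a[:mid], b[:split])
    a2, b2 = hirschberg(a[mid:], b[split:])
    return a1 + a2, b1 + b2


def alignment_score(x, y):
    """Score a gapped alignment under the same illustrative scheme."""
    total = 0
    for p, q in zip(x, y):
        if p == "-" or q == "-":
            total += GAP
        elif p == q:
            total += MATCH
        else:
            total += MISMATCH
    return total


if __name__ == "__main__":
    x, y = hirschberg("AGTACGCA", "TATGC")
    print(x)
    print(y)
    print(alignment_score(x, y))
```

The sketch keeps only two score rows at a time and recovers the alignment by recursing on the two halves around a single midpoint, which is the memory-saving idea that the multi-intermediate variant builds on.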

AVAILABILITY AND IMPLEMENTATION

https://github.com/ogotoh/spaln.
