Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia.
Department of Statistical Modelling, St. Petersburg State University, St. Petersburg, Russia.
BMC Bioinformatics. 2020 Jul 24;21(Suppl 12):306. doi: 10.1186/s12859-020-03590-7.
Graph-based representations of genome assemblies have recently been used in a range of contexts, from improved reconstruction of plasmid sequences and refined analysis of metagenomic data to read error correction and reference-free haplotype reconstruction. Although many of these applications rely heavily on aligning long nucleotide sequences to assembly graphs, the first general-purpose software tools for finding such alignments were released only recently, and their deficiencies and limitations are yet to be discovered. Moreover, existing tools cannot align amino acid sequences, a capability that could prove useful in various contexts, in particular the analysis of metagenomic sequencing data.
In this work we present SPAligner (Saint-Petersburg Aligner), a novel tool for aligning long diverged nucleotide and amino acid sequences to assembly graphs. We demonstrate that SPAligner is an efficient solution for mapping third-generation sequencing reads onto assembly graphs of varying complexity, and we show how it can facilitate the identification of known genes in complex metagenomic datasets.
Our work will help accelerate the development of graph-based approaches to the problem of aligning sequences to genome assemblies. SPAligner is implemented as part of the SPAdes tools library and is available on GitHub.
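To make the sequence-to-graph alignment problem concrete, the following is a minimal illustrative sketch in Python. It is a generic textbook formulation (Dijkstra's algorithm over pairs of graph position and query position, with unit edit costs), not a description of SPAligner's actual implementation. The function name `graph_edit_distance`, the toy graph, and the node-labeled representation are assumptions made for this example only.

```python
import heapq
from itertools import count


def graph_edit_distance(labels, edges, query):
    """Minimum edit distance between `query` and any path in a labeled graph.

    labels: dict mapping node -> non-empty sequence string (hypothetical format)
    edges:  dict mapping node -> list of successor nodes
    The alignment may start and end at any character of the graph.
    """

    def next_positions(pos):
        # A position is (node, offset); None is a virtual start state
        # from which the alignment may enter the graph at any character.
        if pos is None:
            return [(v, i) for v, s in labels.items() for i in range(len(s))]
        v, i = pos
        if i + 1 < len(labels[v]):
            return [(v, i + 1)]
        return [(w, 0) for w in edges.get(v, [])]

    tick = count()  # tie-breaker so the heap never compares positions
    heap = [(0, next(tick), 0, None)]  # (cost, tick, query chars used, position)
    done = set()
    while heap:
        cost, _, j, pos = heapq.heappop(heap)
        if j == len(query):
            return cost  # first settled goal state is optimal
        if (pos, j) in done:
            continue
        done.add((pos, j))
        # Query character left unaligned (insertion relative to the graph).
        heapq.heappush(heap, (cost + 1, next(tick), j + 1, pos))
        for nxt in next_positions(pos):
            v, i = nxt
            # Match (cost 0) or mismatch (cost 1).
            step = 0 if labels[v][i] == query[j] else 1
            heapq.heappush(heap, (cost + step, next(tick), j + 1, nxt))
            # Graph character skipped (deletion); pointless before entering.
            if pos is not None:
                heapq.heappush(heap, (cost + 1, next(tick), j, nxt))


# Toy graph: node "a" branches to "b" and "c".
labels = {"a": "ACG", "b": "TT", "c": "GA"}
edges = {"a": ["b", "c"]}

print(graph_edit_distance(labels, edges, "CGTT"))  # 0: spelled by path a -> b
print(graph_edit_distance(labels, edges, "CGTA"))  # 1: one mismatch on either branch
```

The sketch explores every (graph position, query position) state, so it is practical only for small inputs. Production tools typically rely on seeding or other heuristics to restrict the search.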