Fast and exact gap-affine partial order alignment with POASTA.

Author information

van Dijk Lucas R, Manson Abigail L, Earl Ashlee M, Garimella Kiran V, Abeel Thomas

Affiliations

Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, United States.

Delft Bioinformatics Lab, TU Delft, 2628 XE Delft, The Netherlands.

Publication information

Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae757.

Abstract

MOTIVATION

Partial order alignment is a widely used method for computing multiple sequence alignments, with applications in genome assembly and pangenomics, among many others. Current algorithms to compute the optimal, gap-affine partial order alignment do not scale well to larger graphs and sequences. While heuristic approaches exist, they do not guarantee optimal alignment and sacrifice alignment accuracy.

RESULTS

We present POASTA, a new optimal algorithm for partial order alignment that exploits long stretches of matching sequence between the graph and a query. We benchmarked POASTA against the state-of-the-art on several diverse bacterial gene datasets and demonstrated an average speed-up of 4.1× and up to 9.8×, using less memory. POASTA's memory scaling characteristics enabled the construction of much larger POA graphs than previously possible, as demonstrated by megabase-length alignments of 342 Mycobacterium tuberculosis sequences.

AVAILABILITY AND IMPLEMENTATION

POASTA is available on GitHub at https://github.com/broadinstitute/poasta.
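
For readers unfamiliar with the term "gap-affine" used in the abstract: under an affine gap model, opening a gap incurs a one-time penalty and each gapped position adds a smaller extension penalty, so one long gap costs less than several short ones. The sketch below only illustrates that scoring idea. It is not taken from POASTA; the function name, the penalty values, and the convention that the first gapped position pays both the open and the extend penalty are all assumptions made for this example.

```python
def affine_gap_cost(length: int, gap_open: int = 6, gap_extend: int = 2) -> int:
    """Cost of one contiguous gap of `length` positions under an affine model.

    Assumed convention: cost = gap_open + length * gap_extend.
    The default penalties are illustrative, not POASTA's settings.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        # No gap, no penalty.
        return 0
    return gap_open + length * gap_extend


# One gap of five positions versus five separate gaps of one position each.
one_long_gap = affine_gap_cost(5)
five_short_gaps = 5 * affine_gap_cost(1)
print(one_long_gap, five_short_gaps)
```

With these assumed penalties, the single five-position gap is cheaper than five isolated one-position gaps, which is the behavior that distinguishes affine from linear gap scoring.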
