Fast and exact gap-affine partial order alignment with POASTA.

Authors

van Dijk Lucas R, Manson Abigail L, Earl Ashlee M, Garimella Kiran V, Abeel Thomas

Affiliations

Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, United States.

Delft Bioinformatics Lab, TU Delft, 2628 XE Delft, The Netherlands.

Publication information

Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae757.

DOI:10.1093/bioinformatics/btae757
PMID:39752324
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11755094/
Abstract

MOTIVATION

Partial order alignment is a widely used method for computing multiple sequence alignments, with applications in genome assembly and pangenomics, among many others. Current algorithms to compute the optimal, gap-affine partial order alignment do not scale well to larger graphs and sequences. While heuristic approaches exist, they do not guarantee optimal alignment and sacrifice alignment accuracy.
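
For readers unfamiliar with the term "gap-affine", the sketch below shows one common way such a penalty is defined: a fixed cost to open a gap plus a per-base cost to extend it. The function name, the parameter values, and the convention that the first gapped base pays both costs are assumptions made for illustration; they are not taken from the paper or from POASTA.

```python
def affine_gap_cost(length: int, gap_open: int = 6, gap_extend: int = 2) -> int:
    """Cost of a single gap of `length` bases under an affine model.

    Assumed convention: cost = gap_open + length * gap_extend.
    The default penalties are hypothetical, chosen only for illustration.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0  # no gap, no penalty
    return gap_open + length * gap_extend


# One long gap is cheaper than two shorter gaps covering the same bases,
# because the opening cost is paid only once.
one_gap = affine_gap_cost(4)
two_gaps = 2 * affine_gap_cost(2)
print(one_gap, two_gaps)
```

Under this model, an aligner prefers a few long gaps over many short ones, which is often considered a better fit for biological insertions and deletions than a purely linear gap cost.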

RESULTS

We present POASTA, a new optimal algorithm for partial order alignment that exploits long stretches of matching sequence between the graph and a query. We benchmarked POASTA against the state-of-the-art on several diverse bacterial gene datasets and demonstrated an average speed-up of 4.1× and up to 9.8×, using less memory. POASTA's memory scaling characteristics enabled the construction of much larger POA graphs than previously possible, as demonstrated by megabase-length alignments of 342 Mycobacterium tuberculosis sequences.
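
The idea of exploiting long matching stretches can be illustrated with a toy example. The sketch below is not POASTA's algorithm; it only shows, under simplifying assumptions, how an aligner might walk along an unbranched run of graph nodes for as long as the node labels keep matching the query, without spending alignment cost. The graph representation (`labels`, `succ`) and the function `extend_matches` are hypothetical names introduced here.

```python
from typing import Dict, List, Tuple


def extend_matches(
    labels: Dict[int, str],
    succ: Dict[int, List[int]],
    query: str,
    node: int,
    qpos: int,
) -> Tuple[int, int]:
    """Greedily extend a match along an unbranched path of a DAG.

    Assumes labels[node] already matches query[qpos]. Advances while the
    current node has exactly one successor whose label equals the next
    query character. Stops at branches, mismatches, or the query end.
    Returns the last matched (node, query position).
    """
    if labels[node] != query[qpos]:
        raise ValueError("start state must be a match")
    while qpos + 1 < len(query) and len(succ.get(node, [])) == 1:
        nxt = succ[node][0]
        if labels[nxt] != query[qpos + 1]:
            break  # mismatch: a full aligner would consider edits here
        node, qpos = nxt, qpos + 1
    return node, qpos


# Toy graph: 0(A) -> 1(C) -> 2(G) -> {3(T), 4(A)}
labels = {0: "A", 1: "C", 2: "G", 3: "T", 4: "A"}
succ = {0: [1], 1: [2], 2: [3, 4], 3: [], 4: []}
end_state = extend_matches(labels, succ, "ACGT", node=0, qpos=0)
print(end_state)
```

In this toy graph the extension stops at node 2, where the path branches; an exact aligner would have to explore both outgoing edges from that point rather than commit to one greedily.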

AVAILABILITY AND IMPLEMENTATION

POASTA is available on GitHub at https://github.com/broadinstitute/poasta.
