
Generating multiple alignments on a pangenomic scale.

Authors

Jannik Olbrich, Thomas Büchler, Enno Ohlebusch

Affiliation

Institute of Theoretical Computer Science, Ulm University, Ulm, 89069, Germany.

Publication information

Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf104.

DOI: 10.1093/bioinformatics/btaf104
PMID: 40097267
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11928754/

Abstract

MOTIVATION

Since novel long-read sequencing technologies allow for de novo assembly of many individuals of a species, high-quality assemblies are becoming widely available. For example, the recently published draft human pangenome reference was based on assemblies composed of contigs. There is an urgent need for a software tool that can generate a multiple alignment of genomes of the same species, because current multiple sequence alignment programs cannot deal with such a volume of data.

RESULTS

We show that combining a well-known anchor-based method with the technique of prefix-free parsing yields an approach that can generate multiple alignments on a pangenomic scale, provided that large-scale structural variants are rare. Furthermore, experiments with real-world data show that our software tool PANgenomic Anchor-based Multiple Alignment (PANAMA) significantly outperforms current state-of-the-art programs.
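
To make the idea of prefix-free parsing more concrete, the following is a minimal Python sketch of the general technique, not the implementation used by the authors' tool. It cuts the text at "trigger" windows of length `w` whose hash is divisible by a modulus, so that consecutive phrases overlap by `w` characters, and it stores each distinct phrase only once. The function names, the sentinel characters, the default values of `w` and `modulus`, and the use of `zlib.crc32` as the window hash are all assumptions made for illustration.

```python
import zlib


def is_trigger(window: str, modulus: int) -> bool:
    """Assumed trigger rule: the CRC32 hash of the window is divisible by modulus."""
    return zlib.crc32(window.encode()) % modulus == 0


def prefix_free_parse(text: str, w: int = 4, modulus: int = 5):
    """Return (dictionary, parse) for text.

    dictionary is the sorted list of distinct phrases; parse lists, for each
    phrase in text order, its rank in the dictionary. Assumes the sentinel
    characters '\x02' and '\x01' do not occur in text.
    """
    padded = "\x02" + text + "\x01" * w
    phrases = []
    start = 0
    for i in range(1, len(padded) - w + 1):
        window = padded[i:i + w]
        at_end = i + w == len(padded)
        if at_end or is_trigger(window, modulus):
            # The phrase ends with this window; the next phrase starts with it.
            phrases.append(padded[start:i + w])
            start = i
    dictionary = sorted(set(phrases))
    rank = {phrase: r for r, phrase in enumerate(dictionary)}
    parse = [rank[phrase] for phrase in phrases]
    return dictionary, parse


def reconstruct(dictionary, parse, w: int = 4) -> str:
    """Invert prefix_free_parse by gluing phrases and dropping the w-overlaps."""
    phrases = [dictionary[r] for r in parse]
    padded = phrases[0] + "".join(phrase[w:] for phrase in phrases[1:])
    return padded[1:-w]  # strip the leading sentinel and the w trailing ones


if __name__ == "__main__":
    # Toy stand-in for several near-identical genomes concatenated together.
    genome = "ACGTTGCAAGCTTAGGCTAACCGGTT" * 20
    dictionary, parse = prefix_free_parse(genome)
    print(len(dictionary), "distinct phrases,", len(parse), "phrases in the parse")
    print("round trip ok:", reconstruct(dictionary, parse) == genome)
```

When the input consists of many similar genomes, the same phrases tend to recur, so the dictionary typically grows far more slowly than the text. That is presumably why the technique suits pangenome-scale data.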

AVAILABILITY AND IMPLEMENTATION

Source code is available at: https://gitlab.com/qwerzuiop/panama, archived at swh:1:dir:e90c9f664995acca9063245cabdd97549cf39694.

