SPAligner: alignment of long diverged molecular sequences to assembly graphs.

Affiliations

Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia.

Department of Statistical Modelling, St. Petersburg State University, St. Petersburg, Russia.

Publication information

BMC Bioinformatics. 2020 Jul 24;21(Suppl 12):306. doi: 10.1186/s12859-020-03590-7.

DOI:10.1186/s12859-020-03590-7
PMID:32703258
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7379835/
Abstract

BACKGROUND

Graph-based representation of genome assemblies has recently been used in a range of contexts, from improved reconstruction of plasmid sequences and refined analysis of metagenomic data to read error correction and reference-free haplotype reconstruction. While many of these applications rely heavily on the alignment of long nucleotide sequences to assembly graphs, the first general-purpose software tools for finding such alignments have been released only recently, and their deficiencies and limitations are yet to be discovered. Moreover, existing tools cannot align amino acid sequences, a capability that could prove useful in various contexts, in particular the analysis of metagenomic sequencing data.

RESULTS

In this work we present SPAligner (Saint-Petersburg Aligner), a novel tool for aligning long diverged nucleotide and amino acid sequences to assembly graphs. We demonstrate that SPAligner is an efficient solution for mapping third-generation sequencing reads onto assembly graphs of varying complexity, and we show how it can facilitate the identification of known genes in complex metagenomic datasets.

CONCLUSIONS

Our work will help accelerate the development of graph-based approaches to the problem of aligning sequences to genome assemblies. SPAligner is implemented as part of the SPAdes tools library and is available on GitHub.
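
To make the underlying problem concrete, the following is a minimal sketch of sequence-to-graph alignment in general. It is not SPAligner's algorithm, which the abstract does not describe. It assumes a toy graph in which every node carries a single character, and it computes the smallest edit distance between a query and the spelling of any path in that graph, using Dijkstra's algorithm over (node, query position) states so that cycles are handled. The function name `graph_edit_distance`, the `labels`/`succ` encoding, and the example graph are all hypothetical.

```python
import heapq

SOURCE = -1  # virtual start node, treated as connected to every real node


def graph_edit_distance(query, labels, succ):
    """Minimum edit distance between `query` and any path spelling.

    labels: sequence where labels[v] is the single character of node v
    succ:   dict mapping node v to a list of successor nodes
    (Hypothetical toy encoding; real assembly graphs store longer sequences.)
    """
    n = len(query)
    all_nodes = list(range(len(labels)))
    best = {(SOURCE, 0): 0}
    heap = [(0, 0, SOURCE)]  # (cost, query characters consumed, node)
    while heap:
        cost, i, v = heapq.heappop(heap)
        if cost > best.get((v, i), float("inf")):
            continue  # stale heap entry
        if i == n:
            return cost  # first fully consumed query popped is optimal
        # Option 1: query[i] is aligned to a gap (stay on the same node).
        moves = [(v, i + 1, 1)]
        for w in (all_nodes if v == SOURCE else succ.get(v, [])):
            # Option 2: query[i] is matched or mismatched against node w.
            moves.append((w, i + 1, 0 if labels[w] == query[i] else 1))
            # Option 3: node w is aligned to a gap in the query.
            moves.append((w, i, 1))
        for w, j, step in moves:
            new_cost = cost + step
            if new_cost < best.get((w, j), float("inf")):
                best[(w, j)] = new_cost
                heapq.heappush(heap, (new_cost, j, w))
    return None  # not reached: gap moves can always consume the query


# A small "bubble": the graph spells ACGTA along one branch, ACTTA along the other.
labels = list("ACGTTA")
succ = {0: [1], 1: [2, 3], 2: [4], 3: [4], 4: [5]}

print(graph_edit_distance("ACGTA", labels, succ))  # 0, exact path exists
print(graph_edit_distance("ACATA", labels, succ))  # 1, one substitution
```

The state space grows with the product of graph size and query length, so this exhaustive search is only practical for toy inputs; it is meant to illustrate what an alignment to a graph is, not how a production aligner achieves it.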

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/061b/7379835/21134975ee5e/12859_2020_3590_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/061b/7379835/3d733467fc7d/12859_2020_3590_Fig2_HTML.jpg

Similar articles

1. SPAligner: alignment of long diverged molecular sequences to assembly graphs.
   BMC Bioinformatics. 2020 Jul 24;21(Suppl 12):306. doi: 10.1186/s12859-020-03590-7.
2. Chaining for accurate alignment of erroneous long reads to acyclic variation graphs.
   Bioinformatics. 2023 Aug 1;39(8). doi: 10.1093/bioinformatics/btad460.
3. Haplotype-aware sequence alignment to pangenome graphs.
   Genome Res. 2024 Oct 11;34(9):1265-1275. doi: 10.1101/gr.279143.124.
4. GraphAligner: rapid and versatile sequence-to-graph alignment.
   Genome Biol. 2020 Sep 24;21(1):253. doi: 10.1186/s13059-020-02157-2.
5. NovoGraph: Human genome graph construction from multiple long-read assemblies.
   F1000Res. 2018 Sep 3;7:1391. doi: 10.12688/f1000research.15895.2. eCollection 2018.
6. Fast and SNP-aware short read alignment with SALT.
   BMC Bioinformatics. 2021 Aug 25;22(Suppl 9):172. doi: 10.1186/s12859-021-04088-6.
7. SRPRISM (Single Read Paired Read Indel Substitution Minimizer): an efficient aligner for assemblies with explicit guarantees.
   Gigascience. 2020 Apr 1;9(4). doi: 10.1093/gigascience/giaa023.
8. Bit-parallel sequence-to-graph alignment.
   Bioinformatics. 2019 Oct 1;35(19):3599-3607. doi: 10.1093/bioinformatics/btz162.
9. Detection of simple and complex de novo mutations with multiple reference sequences.
   Genome Res. 2020 Aug;30(8):1154-1169. doi: 10.1101/gr.255505.119. Epub 2020 Aug 19.
10. Aligning optical maps to de Bruijn graphs.
    Bioinformatics. 2019 Sep 15;35(18):3250-3256. doi: 10.1093/bioinformatics/btz069.

Cited by

1. Plant graph-based pangenomics: techniques, applications, and challenges.
   aBIOTECH. 2025 Mar 28;6(2):361-376. doi: 10.1007/s42994-025-00206-7. eCollection 2025 Jun.
2. A survey of sequence-to-graph mapping algorithms in the pangenome era.
   Genome Biol. 2025 May 22;26(1):138. doi: 10.1186/s13059-025-03606-6.
3. Label-guided seed-chain-extend alignment on annotated De Bruijn graphs.
