PangeBlocks: customized construction of pangenome graphs via maximal blocks.

Author affiliations

Department of Informatics, Systems, and Communications, University of Milano - Bicocca, Viale Sarca, 20126, Milano, Italy.

Department of Applied Informatics, Faculty of Mathematics, Physics and Informatics, Comenius University in Bratislava, Mlynská dolina F1, Bratislava, 84248, Slovakia.

Publication information

BMC Bioinformatics. 2024 Nov 4;25(1):344. doi: 10.1186/s12859-024-05958-5.

DOI:10.1186/s12859-024-05958-5
PMID:39497039
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11533710/
Abstract

BACKGROUND

The construction of a pangenome graph is a fundamental task in pangenomics. A natural theoretical question is how to formalize the computational problem of building an optimal pangenome graph, making explicit the underlying optimization criterion and the set of feasible solutions. Current approaches build pangenome graphs heuristically, without stating any explicit optimization criterion. It is therefore unclear how a specific optimization criterion affects the graph topology and downstream analyses such as read mapping and variant calling.

RESULTS

In this paper, by leveraging the notion of maximal block in a Multiple Sequence Alignment (MSA), we reframe the pangenome graph construction problem as an exact cover problem on blocks called Minimum Weighted Block Cover (MWBC). Then we propose an Integer Linear Programming (ILP) formulation for the MWBC problem that allows us to study the most natural objective functions for building a graph. We provide an implementation of the ILP approach for solving the MWBC and we evaluate it on SARS-CoV-2 complete genomes, showing how different objective functions lead to pangenome graphs that have different properties, hinting that the specific downstream task can drive the graph construction phase.
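To make the idea of covering an MSA with weighted blocks concrete, here is a minimal sketch on a toy alignment. It is not the authors' implementation: it replaces the ILP solver with a small backtracking search, uses row-maximal blocks over every column interval as the candidate set (a simplification of the maximal blocks used in the paper), and ignores gaps. The function names and the two weight functions (`count_blocks`, `label_length`) are illustrative assumptions.

```python
def enumerate_blocks(msa):
    """Candidate blocks (rows, start, end): all rows sharing the same
    substring in columns start..end (inclusive) are grouped together."""
    n_cols = len(msa[0])
    blocks = []
    for start in range(n_cols):
        for end in range(start, n_cols):
            groups = {}
            for r, seq in enumerate(msa):
                groups.setdefault(seq[start:end + 1], []).append(r)
            blocks.extend((tuple(rows), start, end) for rows in groups.values())
    return blocks


def block_cells(block):
    """Set of (row, column) cells of the MSA covered by a block."""
    rows, start, end = block
    return frozenset((r, c) for r in rows for c in range(start, end + 1))


def min_weight_block_cover(msa, weight):
    """Exhaustive search for a minimum-weight exact cover of all MSA cells.
    Assumes non-negative weights; only suitable for toy inputs."""
    blocks = [(b, block_cells(b)) for b in enumerate_blocks(msa)]
    all_cells = [(r, c) for r in range(len(msa)) for c in range(len(msa[0]))]
    best = {"cost": float("inf"), "cover": None}

    def search(covered, chosen, cost):
        if cost >= best["cost"]:
            return  # cannot improve on the best cover found so far
        target = next((cell for cell in all_cells if cell not in covered), None)
        if target is None:  # every cell is covered exactly once
            best["cost"], best["cover"] = cost, list(chosen)
            return
        for block, cells in blocks:
            # the block must cover the target and overlap nothing already covered
            if target in cells and not (cells & covered):
                chosen.append(block)
                search(covered | cells, chosen, cost + weight(block))
                chosen.pop()

    search(frozenset(), [], 0)
    return best["cost"], best["cover"]


msa = ["ACGT", "ACTT", "ACGT"]
count_blocks = lambda block: 1                       # fewest blocks
label_length = lambda block: block[2] - block[1] + 1  # shortest total label length

cost_nodes, cover_nodes = min_weight_block_cover(msa, count_blocks)
cost_length, cover_length = min_weight_block_cover(msa, label_length)
print(cost_nodes, cover_nodes)
print(cost_length, cover_length)
```

On this toy input the two weights select different covers: counting blocks favours two long row-wise blocks, while total label length favours shared column-wise blocks. This mirrors, at a very small scale, the abstract's point that the choice of objective function shapes the resulting graph.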

CONCLUSION

We show that a customized construction of a pangenome graph based on selecting objective functions has a direct impact on the resulting graphs. In particular, our formalization of the MWBC problem, based on finding an optimal subset of blocks covering an MSA, paves the way to novel practical approaches to graph representations of an MSA where the user can guide the construction.
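For orientation, a weighted exact cover of this kind can be written generically as the following integer program. This is a sketch in assumed notation, not the formulation from the paper: $\mathcal{B}$ is the set of candidate blocks, $w_b \ge 0$ is the weight assigned to block $b$ by the chosen objective, $x_b$ indicates whether $b$ is selected, and the MSA has $n$ rows and $m$ columns.

```latex
\begin{aligned}
\min \quad & \sum_{b \in \mathcal{B}} w_b \, x_b \\
\text{s.t.} \quad & \sum_{b \in \mathcal{B} \,:\, (r,c) \in b} x_b = 1
  && \forall (r,c) \in \{1,\dots,n\} \times \{1,\dots,m\}, \\
& x_b \in \{0,1\} && \forall b \in \mathcal{B}.
\end{aligned}
```

The equality constraint forces every cell of the MSA to lie in exactly one selected block, and changing $w_b$ (for example, $w_b = 1$ or $w_b$ equal to the label length of $b$) changes which cover is optimal.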

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/a1bc65709f66/12859_2024_5958_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/92995e82083b/12859_2024_5958_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/d1207f088d0e/12859_2024_5958_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/b9c2704bdfc5/12859_2024_5958_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/8c567e8dacc6/12859_2024_5958_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/1674734afd2f/12859_2024_5958_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/6aa730dccaa8/12859_2024_5958_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/bb6fc5d345b2/12859_2024_5958_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/e2ed08abb32c/12859_2024_5958_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/113d1770dce7/12859_2024_5958_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/2357f34b7e83/12859_2024_5958_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/2baf7f9a583c/12859_2024_5958_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/002d9d2fde1e/12859_2024_5958_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbc8/11533710/792b79adfaa1/12859_2024_5958_Fig14_HTML.jpg