Flowtigs: Safety in flow decompositions for assembly graphs.

Authors

Sena Francisco, Ingervo Eliel, Khan Shahbaz, Prjibelski Andrey, Schmidt Sebastian, Tomescu Alexandru

Affiliations

University of Helsinki, Helsinki, Finland.

Indian Institute of Technology Roorkee, Roorkee, India.

Publication information

iScience. 2024 Oct 25;27(12):111208. doi: 10.1016/j.isci.2024.111208. eCollection 2024 Dec 20.

DOI:10.1016/j.isci.2024.111208
PMID:39759024
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11700653/
Abstract

A flow decomposition of a network flow is a set of weighted walks whose superposition equals the flow. In this article, we give a simple and linear-time-verifiable complete characterization (flowtigs) of walks that are safe in such general flow decompositions, i.e., that are subwalks of any possible flow decomposition. We provide an ()-time algorithm that identifies all maximal flowtigs and represents them inside a compact structure. On the practical side, we study flowtigs in the use case of metagenomic assembly. By using the species abundances as flow values of the metagenomic assembly graph, we can model the possible assembly solutions as flow decompositions into weighted closed walks. On simulated data, reporting flowtigs results in a notably more contiguous assembly than reporting unitigs or maximal safe walks based only on the graph structure. On real data, we frame flowtigs as a heuristic and provide an algorithm that is guided by this heuristic.
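The definition in the first sentence of the abstract can be illustrated with a minimal Python sketch. It only checks whether a given set of weighted walks superposes to a given flow; it is not the flowtig characterization or the algorithm from the article. The function names (`superpose`, `is_decomposition`), the edge-dictionary representation, and the toy graph are assumptions made for illustration.

```python
from collections import Counter

def superpose(weighted_walks):
    """Add each walk's weight to every edge the walk traverses.

    A walk is a list of nodes; a closed walk repeats its first node at the end.
    (Representation chosen for this sketch, not taken from the article.)
    """
    induced = Counter()
    for walk, weight in weighted_walks:
        for u, v in zip(walk, walk[1:]):
            induced[(u, v)] += weight
    return induced

def is_decomposition(flow, weighted_walks):
    """Return True if the superposition of the walks equals the flow on every edge."""
    induced = superpose(weighted_walks)
    edges = set(flow) | set(induced)
    return all(flow.get(e, 0) == induced.get(e, 0) for e in edges)

# Hypothetical toy flow: two closed walks share the edge a -> b.
flow = {("a", "b"): 3, ("b", "c"): 2, ("c", "a"): 2, ("b", "d"): 1, ("d", "a"): 1}
good = [(["a", "b", "c", "a"], 2), (["a", "b", "d", "a"], 1)]
bad = [(["a", "b", "c", "a"], 3)]

print(is_decomposition(flow, good))  # the two walks reproduce every edge value
print(is_decomposition(flow, bad))   # edge b -> c would carry 3 instead of 2
```

In this vocabulary, a walk is safe when it appears as a subwalk of some walk in every decomposition that passes such a check; deciding that efficiently is what the article's characterization addresses.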

