Safely Filling Gaps with Partial Solutions Common to All Solutions.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2019 Mar-Apr;16(2):617-626. doi: 10.1109/TCBB.2017.2785831. Epub 2018 Jan 15.

DOI:10.1109/TCBB.2017.2785831
PMID:29994355
Abstract

Gap filling has emerged as a natural sub-problem of many de novo genome assembly projects. The gap filling problem generally asks for an $s$-$t$ path in an assembly graph whose length matches the gap length estimate. Several methods have addressed it, but only a few have focused on strategies for dealing with multiple gap filling solutions and for guaranteeing reliable results. Such strategies include reporting only unique solutions, or exhaustively enumerating all filling solutions and heuristically creating their consensus. Our main contribution is a new method for reliable gap filling: filling gaps with those sub-paths common to all gap filling solutions. We call these partial solutions safe, following the framework of (Tomescu and Medvedev, RECOMB 2016). We give an efficient safe algorithm running in $O(dm)$ time and space, where $d$ is the gap length estimate and $m$ is the number of edges of the assembly graph. To show the benefits of this method, we implemented this algorithm for the problem of filling gaps in scaffolds. Our experimental results on bacterial and on conservative human assemblies show that, on average, our method can retrieve over 73 percent more safe and correct bases as compared to previous methods, with a similar precision.
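
The idea of keeping only what all gap filling solutions share can be illustrated with a small sketch. It is not the authors' algorithm, only a simplified model under stated assumptions: every edge has length 1, a solution is any $s$-$t$ walk with exactly $d$ edges, and "common to all solutions" is checked per step position. The function names `forced_edges` and `safe_subpaths` and the toy graph are hypothetical. The sketch counts walks forwards from $s$ and backwards from $t$ with $O(dm)$ arithmetic operations, then marks an edge as forced at step $i$ when the number of solutions passing through it there equals the total number of solutions.

```python
def forced_edges(edges, s, t, d):
    """Return (total, forced) for s-t walks with exactly d unit-length edges.

    forced[i] is the edge used at step i by every such walk, or None if the
    walks disagree at that step (or if no walk exists).
    """
    # fwd[i][v]: number of walks with i edges from s to v
    fwd = [dict() for _ in range(d + 1)]
    fwd[0][s] = 1
    for i in range(d):
        for u, v in edges:
            ways = fwd[i].get(u, 0)
            if ways:
                fwd[i + 1][v] = fwd[i + 1].get(v, 0) + ways

    # bwd[i][v]: number of walks with d - i edges from v to t
    bwd = [dict() for _ in range(d + 1)]
    bwd[d][t] = 1
    for i in range(d - 1, -1, -1):
        for u, v in edges:
            ways = bwd[i + 1].get(v, 0)
            if ways:
                bwd[i][u] = bwd[i].get(u, 0) + ways

    total = fwd[d].get(t, 0)
    forced = [None] * d
    if total == 0:
        return 0, forced
    for i in range(d):
        for u, v in edges:
            # Solutions through (u, v) at step i = prefixes * suffixes.
            if fwd[i].get(u, 0) * bwd[i + 1].get(v, 0) == total:
                forced[i] = (u, v)
                break
    return total, forced


def safe_subpaths(forced):
    """Merge runs of consecutive forced edges into vertex sequences."""
    paths, current = [], []
    for edge in forced:
        if edge is None:
            if current:
                paths.append(current)
            current = []
        elif current:
            current.append(edge[1])  # consecutive forced edges always chain
        else:
            current = [edge[0], edge[1]]
    if current:
        paths.append(current)
    return paths


# Toy graph (hypothetical): two solutions that differ only in the middle.
edges = [("s", "a"), ("a", "b"), ("a", "c"), ("b", "e"), ("c", "e"), ("e", "t")]
total, forced = forced_edges(edges, "s", "t", 4)
print(total)                  # 2
print(safe_subpaths(forced))  # [['s', 'a'], ['e', 't']]
```

In this toy case the two solutions s-a-b-e-t and s-a-c-e-t agree only on their first and last edges, so only those would be reported and the ambiguous middle stays unfilled. The walk counts can grow very large on real graphs; Python integers absorb this, but a practical implementation would need to handle it explicitly.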

Similar articles

1
Safely Filling Gaps with Partial Solutions Common to All Solutions.
IEEE/ACM Trans Comput Biol Bioinform. 2019 Mar-Apr;16(2):617-626. doi: 10.1109/TCBB.2017.2785831. Epub 2018 Jan 15.
2
Gap Filling as Exact Path Length Problem.
J Comput Biol. 2016 May;23(5):347-61. doi: 10.1089/cmb.2015.0197. Epub 2016 Mar 9.
3
GapPredict - A Language Model for Resolving Gaps in Draft Genome Assemblies.
IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2802-2808. doi: 10.1109/TCBB.2021.3109557. Epub 2021 Dec 8.
4
Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches.
BMC Genomics. 2016 Aug 22;17 Suppl 7(Suppl 7):507. doi: 10.1186/s12864-016-2895-8.
5
RFfiller: a robust and fast statistical algorithm for gap filling in draft genomes.
PeerJ. 2022 Oct 14;10:e14186. doi: 10.7717/peerj.14186. eCollection 2022.
6
GAPPadder: a sensitive approach for closing gaps on draft genomes with short sequence reads.
BMC Genomics. 2019 Jun 6;20(Suppl 5):426. doi: 10.1186/s12864-019-5703-4.
7
Filling gaps of genome scaffolds via probabilistic searching optical maps against assembly graph.
BMC Bioinformatics. 2021 Oct 30;22(1):533. doi: 10.1186/s12859-021-04448-2.
8
Multiple sequence assembly from reads alignable to a common reference genome.
IEEE/ACM Trans Comput Biol Bioinform. 2011 Sep-Oct;8(5):1283-95. doi: 10.1109/TCBB.2010.107.
9
Assembly of long, error-prone reads using repeat graphs.
Nat Biotechnol. 2019 May;37(5):540-546. doi: 10.1038/s41587-019-0072-8. Epub 2019 Apr 1.
10
A safe and complete algorithm for metagenomic assembly.
Algorithms Mol Biol. 2018 Feb 7;13:3. doi: 10.1186/s13015-018-0122-7. eCollection 2018.

Cited by

1
Variant genotyping with gap filling.
PLoS One. 2017 Sep 8;12(9):e0184608. doi: 10.1371/journal.pone.0184608. eCollection 2017.