• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

最大化增加 Duo 保留的支架填充的算法和难度。

Algorithms and Hardness for Scaffold Filling to Maximize Increased Duo-Preservations.

出版信息

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jul-Aug;19(4):2071-2079. doi: 10.1109/TCBB.2021.3083896. Epub 2022 Aug 8.

DOI:10.1109/TCBB.2021.3083896
PMID:34038366
Abstract

Scaffold filling is a critical step in DNA assembly. Its basic task is to fill the missing genes (fragments) into an incomplete genome (scaffold) to make it similar to the reference genome. There have been a lot of work under distinct measurements in the literature of genome comparison. For genomes with gene duplications, common string partition reveals the similarity more precisely, since it constructs a one-to-one correspondence between the same segments in the two genomes. In this paper, we adopt duo-preservation as the measurement, which is the complement of common string partition, i.e., the number of duo-preservations added to the number of common strings is exactly the length of a genome. Towards a proper scaffold filling, we just focus on the increased duo-preservations. This problem is called scaffold filling to maximize increased duo-preservations (abbr. SF-MIDP). We show that SF-MIDP is solvable in linear time for a simple version where all the genes of the scaffold are matched in a block-matching, but MAX SNP-complete for the general version, and cannot be approximated within [Formula: see text]. Moreover, we present a basic approximation algorithm of factor 2, by which the optimal solution can be described in a new way, and then, improve the approximation factor to [Formula: see text] via a greedy method.

摘要

支架填充是 DNA 组装的关键步骤。它的基本任务是将缺失的基因(片段)填充到不完整的基因组(支架)中,使其与参考基因组相似。在基因组比较的文献中有很多不同的方法进行这项工作。对于具有基因重复的基因组,常见的字符串划分更精确地揭示了相似性,因为它在两个基因组中的相同片段之间建立了一对一的对应关系。在本文中,我们采用二元保存作为度量标准,它是常见字符串划分的补集,即添加的二元保存数与公共字符串数之和恰好是基因组的长度。为了进行适当的支架填充,我们只关注增加的二元保存数。这个问题被称为最大化增加的二元保存数的支架填充(缩写为 SF-MIDP)。我们表明,对于一个简单的版本,其中所有支架的基因都在块匹配中匹配,SF-MIDP 可以在线性时间内求解,但对于一般版本,它是 MAX SNP 完全的,并且不能在 [Formula: see text] 内逼近。此外,我们提出了一个基本的逼近算法,其逼近因子为 2,通过该算法可以以一种新的方式描述最优解,然后通过贪婪方法将逼近因子提高到 [Formula: see text]。

相似文献

1
Algorithms and Hardness for Scaffold Filling to Maximize Increased Duo-Preservations.最大化增加 Duo 保留的支架填充的算法和难度。
IEEE/ACM Trans Comput Biol Bioinform. 2022 Jul-Aug;19(4):2071-2079. doi: 10.1109/TCBB.2021.3083896. Epub 2022 Aug 8.
2
An improved approximation algorithm for scaffold filling to maximize the common adjacencies.一种改进的支架填充近似算法,以最大化公共邻接。
IEEE/ACM Trans Comput Biol Bioinform. 2013 Jul-Aug;10(4):905-13. doi: 10.1109/TCBB.2013.100.
3
A 2-approximation algorithm for the contig-based genomic scaffold filling problem.
J Bioinform Comput Biol. 2018 Dec;16(6):1850022. doi: 10.1142/S0219720018500221.
4
Scaffold filling under the breakpoint and related distances.支架填充在断点下和相关距离。
IEEE/ACM Trans Comput Biol Bioinform. 2012 Jul-Aug;9(4):1220-9. doi: 10.1109/TCBB.2012.57.
5
Genome Rearrangement Distance with Reversals, Transpositions, and Indels.带反转、转位和插入缺失的基因组重排距离
J Comput Biol. 2021 Mar;28(3):235-247. doi: 10.1089/cmb.2020.0121. Epub 2020 Oct 20.
6
Approaching the One-Sided Exemplar Adjacency Number Problem.逼近单边典范邻接数问题。
IEEE/ACM Trans Comput Biol Bioinform. 2020 Nov-Dec;17(6):1946-1954. doi: 10.1109/TCBB.2019.2913834. Epub 2020 Dec 8.
7
Flanked Block-Interchange Distance on Strings.弦上的交错块间隔。
IEEE/ACM Trans Comput Biol Bioinform. 2024 Mar-Apr;21(2):301-311. doi: 10.1109/TCBB.2024.3351440. Epub 2024 Apr 8.
8
Greedy Hypervolume Subset Selection in Low Dimensions.低维空间中的贪婪超体积子集选择
Evol Comput. 2016 Fall;24(3):521-44. doi: 10.1162/EVCO_a_00188. Epub 2016 Jun 15.
9
On the rank-distance median of 3 permutations.关于 3 个排列的秩距中值。
BMC Bioinformatics. 2018 May 8;19(Suppl 6):142. doi: 10.1186/s12859-018-2131-4.
10
Encoded expansion: an efficient algorithm to discover identical string motifs.编码扩展:一种发现相同字符串基序的高效算法。
PLoS One. 2014 May 28;9(5):e95148. doi: 10.1371/journal.pone.0095148. eCollection 2014.