一种用于参考锚定基因块的双簇方法及其在基因组岛中的应用。

A Biclique Approach to Reference-Anchored Gene Blocks and Its Applications to Genomic Islands.

作者信息

Benshahar Arnon, Chalifa-Caspi Vered, Hermelin Danny, Ziv-Ukelson Michal

机构信息

1 Department of Computer Science, Ben-Gurion University , Beer-Sheva, Israel .

2 National Institute for Biotechnology in the Negev, Ben-Gurion University , Beer-Sheva, Israel .

出版信息

J Comput Biol. 2018 Feb;25(2):214-235. doi: 10.1089/cmb.2017.0108. Epub 2017 Oct 13.

DOI:10.1089/cmb.2017.0108

PMID:29028176

Abstract

We formalize a new problem variant in gene-block discovery, denoted Reference-Anchored Gene Blocks (RAGB), given a query sequence Q of length n, representing the gene array of a DNA element, a window size bound d on the length of a substring of interest in Q, and a set of target gene sequences [Formula: see text]. Our objective is to identify gene blocks in [Formula: see text] that are centered in a subset q of co-localized genes from Q, and contain genomes from [Formula: see text] in which the corresponding orthologs of the genes from q are also co-localized. We cast RAGB as a variant of a (colored) biclique problem in bipartite graphs, and analyze its parameterized complexity, as well as the parameterized complexity of other related problems. We give an [Formula: see text] time algorithm for the uncolored variant of our biclique problem, where m is the number of areas of interest that are parsed from the target sequences, and n and d are as defined earlier. Our algorithm can be adapted to compute all maximal bicliques in the graph within the same time complexity, and to handle edge weights with a slight [Formula: see text] increase to its time complexity. For the colored version of the problem, our algorithm has a time complexity of [Formula: see text]. We implement the algorithm and exemplify its application to the data mining of proteobacterial gene blocks that are centered in predicted proteobacterial genomic islands, leading to the identification of putatively mobilized clusters of virulence, pathogenicity, and resistance genes.

摘要

在基因块发现中，我们形式化了一个新的问题变体，称为参考锚定基因块（RAGB）。给定一个长度为n的查询序列Q，它代表一个DNA元件的基因阵列，一个关于Q中感兴趣子串长度的窗口大小界限d，以及一组目标基因序列[公式：见原文]。我们的目标是在[公式：见原文]中识别基因块，这些基因块以Q中一组共定位基因的子集q为中心，并且包含来自[公式：见原文]的基因组，其中q中基因的相应直系同源基因也在这些基因组中共定位。我们将RAGB转化为二分图中（带颜色的）双团问题的一个变体，并分析其参数化复杂度以及其他相关问题的参数化复杂度。对于我们双团问题的无色变体，我们给出了一个[公式：见原文]时间复杂度的算法，其中m是从目标序列中解析出的感兴趣区域的数量，n和d如前文所定义。我们的算法可以在相同的时间复杂度内进行调整以计算图中的所有最大双团，并在时间复杂度略有[公式：见原文]增加的情况下处理边权重。对于该问题的带颜色版本，我们的算法时间复杂度为[公式：见原文]。我们实现了该算法，并举例说明了其在以预测的变形菌属基因组岛为中心的变形菌属基因块数据挖掘中的应用，从而识别出可能被转移的毒力、致病性和抗性基因簇。

相似文献

A Biclique Approach to Reference-Anchored Gene Blocks and Its Applications to Genomic Islands.一种用于参考锚定基因块的双簇方法及其在基因组岛中的应用。

J Comput Biol. 2018 Feb;25(2):214-235. doi: 10.1089/cmb.2017.0108. Epub 2017 Oct 13.

(p,q)-biclique counting and enumeration for large sparse bipartite graphs.大型稀疏二分图的(p,q)双团计数与枚举

VLDB J. 2023 Mar 13:1-25. doi: 10.1007/s00778-023-00786-0.

New algorithms for structure informed genome rearrangement.用于结构信息基因组重排的新算法。

Algorithms Mol Biol. 2023 Dec 1;18(1):17. doi: 10.1186/s13015-023-00239-x.

Approximate search for known gene clusters in new genomes using PQ-trees.使用PQ树在新基因组中近似搜索已知基因簇。

Algorithms Mol Biol. 2021 Jul 9;16(1):16. doi: 10.1186/s13015-021-00190-9.

Encoded expansion: an efficient algorithm to discover identical string motifs.编码扩展：一种发现相同字符串基序的高效算法。

PLoS One. 2014 May 28;9(5):e95148. doi: 10.1371/journal.pone.0095148. eCollection 2014.

ET-Motif: Solving the Exact (l, d)-Planted Motif Problem Using Error Tree Structure.ET-基序：使用错误树结构解决精确的（l，d）植入基序问题

J Comput Biol. 2016 Jul;23(7):615-23. doi: 10.1089/cmb.2015.0238. Epub 2016 May 6.

Biclique: an R package for maximal biclique enumeration in bipartite graphs.双团：用于二分图中最大双团枚举的R语言包。

BMC Res Notes. 2020 Feb 21;13(1):88. doi: 10.1186/s13104-020-04955-0.

Parameterized runtime analyses of evolutionary algorithms for the planar euclidean traveling salesperson problem.针对平面欧几里得旅行商问题的进化算法的参数化运行时分析。

Evol Comput. 2014 Winter;22(4):595-628. doi: 10.1162/EVCO_a_00119.

Greedy Hypervolume Subset Selection in Low Dimensions.低维空间中的贪婪超体积子集选择

Evol Comput. 2016 Fall;24(3):521-44. doi: 10.1162/EVCO_a_00188. Epub 2016 Jun 15.

Quantum algorithm for multivariate polynomial interpolation.多元多项式插值的量子算法。

Proc Math Phys Eng Sci. 2018 Jan;474(2209):20170480. doi: 10.1098/rspa.2017.0480. Epub 2018 Jan 17.

一种用于参考锚定基因块的双簇方法及其在基因组岛中的应用。

A Biclique Approach to Reference-Anchored Gene Blocks and Its Applications to Genomic Islands.

作者信息

Benshahar Arnon, Chalifa-Caspi Vered, Hermelin Danny, Ziv-Ukelson Michal

机构信息

1 Department of Computer Science, Ben-Gurion University , Beer-Sheva, Israel .

2 National Institute for Biotechnology in the Negev, Ben-Gurion University , Beer-Sheva, Israel .

出版信息

J Comput Biol. 2018 Feb;25(2):214-235. doi: 10.1089/cmb.2017.0108. Epub 2017 Oct 13.

DOI:10.1089/cmb.2017.0108

PMID:29028176

Abstract

摘要

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

一种用于参考锚定基因块的双簇方法及其在基因组岛中的应用。

A Biclique Approach to Reference-Anchored Gene Blocks and Its Applications to Genomic Islands.

作者信息

机构信息

出版信息

相似文献

一种用于参考锚定基因块的双簇方法及其在基因组岛中的应用。

A Biclique Approach to Reference-Anchored Gene Blocks and Its Applications to Genomic Islands.

作者信息

机构信息

出版信息

相似文献