Suppr超能文献

RNA树的无根无序同胚子树比对

Unrooted unordered homeomorphic subtree alignment of RNA trees.

作者信息

Milo Nimrod, Zakov Shay, Katzenelson Erez, Bachmat Eitan, Dinitz Yefim, Ziv-Ukelson Michal

机构信息

Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel.

Department of Computer Science and Engineering, UC San Diego, La Jolla, CA, USA.

出版信息

Algorithms Mol Biol. 2013 Apr 16;8(1):13. doi: 10.1186/1748-7188-8-13.

Abstract

: We generalize some current approaches for RNA tree alignment, which are traditionally confined to ordered rooted mappings, to also consider unordered unrooted mappings. We define the Homeomorphic Subtree Alignment problem (HSA), and present a new algorithm which applies to several modes, combining global or local, ordered or unordered, and rooted or unrooted tree alignments. Our algorithm generalizes previous algorithms that either solved the problem in an asymmetric manner, or were restricted to the rooted and/or ordered cases. Focusing here on the most general unrooted unordered case, we show that for input trees T and S, our algorithm has an O(nTnS + min(dT,dS)LTLS) time complexity, where nT,LT and dT are the number of nodes, the number of leaves, and the maximum node degree in T, respectively (satisfying dT ≤ LT ≤ nT), and similarly for nS,LS and dS with respect to the tree S. This improves the time complexity of previous algorithms for less general variants of the problem.In order to obtain this time bound for HSA, we developed new algorithms for a generalized variant of the Min-Cost Bipartite Matching problem (MCM), as well as to two derivatives of this problem, entitled All-Cavity-MCM and All-Pairs-Cavity-MCM. For two input sets of size n and m, where n ≤ m, MCM and both its cavity derivatives are solved in O(n3 + nm) time, without the usage of priority queues (e.g. Fibonacci heaps) or other complex data structures. This gives the first cubic time algorithm for All-Pairs-Cavity-MCM, and improves the running times of MCM and All-Cavity-MCM problems in the unbalanced case where n ≪ m.We implemented the algorithm (in all modes mentioned above) as a graphical software tool which computes and displays similarities between secondary structures of RNA given as input, and employed it to a preliminary experiment in which we ran all-against-all inter-family pairwise alignments of RNAse P and Hammerhead RNA family members, exposing new similarities which could not be detected by the traditional rooted ordered alignment approaches. The results demonstrate that our approach can be used to expose structural similarity between some RNAs with higher sensitivity than the traditional rooted ordered alignment approaches. Source code and web-interface for our tool can be found in http://www.cs.bgu.ac.il/\~negevcb/FRUUT.

摘要

我们将当前一些用于RNA树比对的方法进行了推广,这些方法传统上局限于有序有根映射,现在也考虑无序无根映射。我们定义了同构子树比对问题(HSA),并提出了一种新算法,该算法适用于多种模式,包括全局或局部、有序或无序、有根或无根树比对。我们的算法推广了先前的算法,这些算法要么以非对称方式解决问题,要么局限于有根和/或有序情况。在此重点关注最一般的无序无根情况,我们表明对于输入树T和S,我们的算法具有O(nTnS + min(dT, dS)LTLS)的时间复杂度,其中nT、LT和dT分别是T中的节点数、叶数和最大节点度(满足dT ≤ LT ≤ nT),对于树S的nS、LS和dS也类似。这改进了先前针对该问题不太一般变体的算法的时间复杂度。为了获得HSA的这个时间界限,我们针对最小成本二分匹配问题(MCM)的一个广义变体以及该问题的两个衍生问题,即全腔MCM和全对全腔MCM,开发了新算法。对于大小分别为n和m的两个输入集,其中n ≤ m,MCM及其两个腔衍生问题都能在O(n3 + nm)时间内解决,无需使用优先队列(如斐波那契堆)或其他复杂数据结构。这给出了全对全腔MCM的首个三次时间算法,并在n ≪ m的不平衡情况下改进了MCM和全腔MCM问题的运行时间。我们将该算法(在上述所有模式下)实现为一个图形软件工具,该工具计算并显示作为输入给出的RNA二级结构之间的相似性,并将其用于一个初步实验,在该实验中我们对RNA酶P和锤头状RNA家族成员进行了全对全家族间两两比对,揭示了传统有根有序比对方法无法检测到的新相似性。结果表明,我们的方法可用于以比传统有根有序比对方法更高的灵敏度揭示一些RNA之间的结构相似性。我们工具的源代码和网络界面可在http://www.cs.bgu.ac.il/\~negevcb/FRUUT找到。

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验