Wang Lusheng, Zhao Jianyun
Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong, Peoples Republic of China.
Bioinformatics. 2003 Nov 22;19(17):2237-45. doi: 10.1093/bioinformatics/btg305.
Computing the similarity between two ordered trees has applications in RNA secondary structure comparison, genetics and chemical structure analysis. Alignment of tree is one of the proposed measures. Similar to pair-wise sequence comparison, there is often disagreement about how to weight matches, mismatches, indels and gaps when we compare two trees. For sequence comparison, the parametric sequence alignment tools have been developed. The users are allowed to see explicitly and completely the effect of parameter choices on the optimal sequence alignments. A similar tool for aligning two ordered trees is required in practice.
We develop a parametric tool for aligning two ordered trees that allow users to see the effect of parameter choices on the optimal alignment of trees. Our contributions include: (1) develop a parametric tool for aligning two ordered trees; (2) design an efficient algorithm for aligning two ordered trees with gap penalties that runs in O(n(2)deg(2)) time, where n is the number of nodes in the trees and deg is the degree of the trees; and (3) reduce the space of the algorithm from O(n(2)deg(2)) to O(n log n. deg(2)).
The software is available at http://www.cs.cityu.edu.hk/~lwang/software/ParaTree
计算两个有序树之间的相似度在RNA二级结构比较、遗传学和化学结构分析中具有应用价值。树的比对是一种已提出的度量方法。与成对序列比较类似,在比较两棵树时,对于如何权衡匹配、错配、插入和间隙,常常存在分歧。对于序列比较,已经开发了参数化序列比对工具。用户能够明确且全面地看到参数选择对最优序列比对的影响。实际上需要一个类似的用于比对两个有序树的工具。
我们开发了一个用于比对两个有序树的参数化工具,该工具允许用户看到参数选择对树的最优比对的影响。我们的贡献包括:(1)开发一个用于比对两个有序树的参数化工具;(2)设计一种有效的算法,用于在存在间隙罚分的情况下比对两个有序树,其运行时间为O(n(2)deg(2)),其中n是树中的节点数,deg是树的度;以及(3)将算法的空间从O(n(2)deg(2))减少到O(n log n. deg(2))。