The Mina & Everard Goodman Faculty of Life Sciences, Bar Ilan University, Ramat Gan, Israel.
Faculty of Engineering, Tel-Aviv University, Tel Aviv, Israel.
Sci Rep. 2022 Mar 28;12(1):5256. doi: 10.1038/s41598-022-08360-4.
A vectorial distance measure for trees is presented. Given two trees, we define a Tree-Alignment (T-alignment). We T-align the trees from their centers outwards, starting from the root-branches, to make the next level as similar as possible. The algorithm is recursive: conditioned on the T-alignment of the root-branches, we T-align the sub-branches, and thereafter each T-alignment is conditioned on the previous one. We define a minimal T-alignment under a lexicographic order, which follows the intuition that the differences between the two trees constitute a vector. Given such a minimal T-alignment, the difference in the number of branches calculated at any level defines the entry of the distance vector at that level. We compare our algorithm to other well-known tree distance measures in the task of clustering sets of phylogenetic trees, using the TreeSimGM simulator to generate stochastic phylogenetic trees. The vectorial tree distance (VTD) successfully separates symmetric from asymmetric trees, and hierarchical from non-hierarchical trees. We also test the algorithm as a classifier of phylogenetic trees extracted from two members of the fungi kingdom, mushrooms and mildews, showing that the algorithm can separate real-world phylogenetic trees. The Matlab code can be accessed via: https://gitlab.com/avner.priel/vectorial-tree-distance .
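To make the idea of a level-wise, lexicographically minimal alignment concrete, the following is a minimal Python sketch. It is not the authors' Matlab implementation, and it makes several assumptions that the abstract does not state: trees are rooted and written as nested lists (a node is the list of its child branches), the alignment is found by brute force over all pairings of branches, and an unmatched branch contributes the branch counts of its entire subtree to the deeper levels. The names `level_counts`, `add_vectors`, `strip_zeros`, and `vtd_sketch` are hypothetical.

```python
from itertools import permutations


def level_counts(tree):
    """Branch counts per level below `tree` (level 1 = its direct branches)."""
    counts, frontier = [], [tree]
    while True:
        frontier = [child for node in frontier for child in node]
        if not frontier:
            return counts
        counts.append(len(frontier))


def add_vectors(u, v):
    """Entry-wise sum of two vectors, padding the shorter one with zeros."""
    n = max(len(u), len(v))
    return [(u[i] if i < len(u) else 0) + (v[i] if i < len(v) else 0)
            for i in range(n)]


def strip_zeros(v):
    """Drop trailing zeros so list comparison acts as lexicographic order."""
    while v and v[-1] == 0:
        v = v[:-1]
    return v


def vtd_sketch(a, b):
    """Lexicographically smallest level-wise difference vector (assumed model).

    Entry k is the difference in the number of branches at level k+1 under
    the best pairing of branches. Brute force: small trees only.
    """
    if len(a) > len(b):
        a, b = b, a  # make `a` the node with fewer branches
    best = None
    # Try every way of pairing each branch of `a` with a distinct branch of `b`.
    for perm in permutations(range(len(b)), len(a)):
        deeper = []
        for i, j in enumerate(perm):
            # Matched branches: align their sub-branches recursively.
            deeper = add_vectors(deeper, vtd_sketch(a[i], b[j]))
        for j in set(range(len(b))) - set(perm):
            # Unmatched branches: their whole subtree counts as a difference.
            deeper = add_vectors(deeper, level_counts(b[j]))
        candidate = strip_zeros([len(b) - len(a)] + deeper)
        if best is None or candidate < best:
            best = candidate
    return best


tree_a = [[[]], []]        # two root-branches; the first has one sub-branch
tree_b = [[], [[], []]]    # two root-branches; the second has two sub-branches
print(vtd_sketch(tree_a, tree_b))  # [0, 1]
```

In this example both trees have two root-branches, so the first entry is 0. Pairing the branch that has one sub-branch with the branch that has two leaves a difference of 1 at the second level, whereas the other pairing would give 3, so the minimal vector under the lexicographic order is `[0, 1]`.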