IEEE/ACM Trans Comput Biol Bioinform. 2019 Mar-Apr;16(2):569-582. doi: 10.1109/TCBB.2018.2790957. Epub 2018 Jan 8.
Tree comparison metrics are an important tool for the study of phylogenetic trees. Path-difference distances measure the dissimilarity between two phylogenetic trees (on the same set of taxa) by comparing their path-length vectors. Various norms can be applied to this distance. Three important examples are the $l_{1}\text{-},;l_{2}\text{-}$l1-,l2-, and $l_{{\infty }}$l∞-norms. The previous best algorithms for computing path-difference distances all have $O(n^{2})$O(n2) running time. In this paper, we show how to compute the $l_{1}$l1-norm path-difference distance in $O(n;{\log}^{2};n)$O(nlog2n) time and how to compute the $l_{2}$l2- and $l_{{\infty }}$l∞-norm path-difference distances in $O(n;{\log};n)$O(nlogn) time. By extending the presented algorithms, we also show that the $l_{p}$lp-norm path-difference distance can be computed in $O(pn;{\log}^{2};n)$O(pnlog2n) time for any positive integer $p$p. In addition, when the integer $p$p is even, we show that the distance can be computed in $O(p^{2}n;{\log};n)$O(p2nlogn) time as well.
树比较度量标准是研究系统发育树的重要工具。路径差异距离通过比较两个系统发育树(在同一组分类单元上)的路径长度向量来衡量它们之间的不相似性。可以对这个距离应用各种范数。三个重要的例子是 $l_{1}$-、$l_{2}$- 和 $l_{{\infty }}$-范数。以前计算路径差异距离的最佳算法的运行时间都是 $O(n^{2})$。在本文中,我们展示了如何在 $O(n;{\log}^{2};n)$ 的时间内计算 $l_{1}$-范数路径差异距离,以及如何在 $O(n;{\log};n)$ 的时间内计算 $l_{2}$-和 $l_{{\infty }}$-范数路径差异距离。通过扩展提出的算法,我们还表明,对于任何正整数 $p$,可以在 $O(pn;{\log}^{2};n)$ 的时间内计算出 $l_{p}$-范数路径差异距离。此外,当整数 $p$ 为偶数时,我们还表明,该距离也可以在 $O(p^{2}n;{\log};n)$ 的时间内计算出来。