Bryant David, Steel Mike
Department of Mathematics, University of Auckland, Private Bag 92010, Auckland, New Zealand.
IEEE/ACM Trans Comput Biol Bioinform. 2009 Jul-Sep;6(3):420-6. doi: 10.1109/TCBB.2009.32.
The Robinson-Foulds (RF) distance is by far the most widely used measure of dissimilarity between trees. Although the distribution of these distances has been investigated for 20 years, an algorithm that is explicitly polynomial time has yet to be described for computing the distribution for trees around a given tree. In this paper, we derive a polynomial-time algorithm for this distribution. We show how the distribution can be approximated by a Poisson distribution determined by the proportion of leaves that lie in "cherries" of the given tree. We also describe how our results can be used to derive normalization constants that are required in a recently proposed maximum likelihood approach to supertree construction.
罗宾逊 - 福尔兹(RF)距离是目前在衡量树之间差异时使用最为广泛的方法。尽管对这些距离的分布已经研究了20年,但尚未有明确的多项式时间算法来计算围绕给定树的树的分布。在本文中,我们推导了用于此分布的多项式时间算法。我们展示了如何通过由给定树的“樱桃”中叶子的比例所确定的泊松分布来近似该分布。我们还描述了如何利用我们的结果来推导最近提出的超树构建最大似然方法中所需的归一化常数。