Goloboff Pablo A, Szumik Claudia A
Unidad Ejecutora Lillo, Consejo Nacional de Investigaciones Científicas y Técnicas, Miguel Lillo 251, 4000, S.M. de Tucumán, Argentina.
Cladistics. 2016 Feb;32(1):82-89. doi: 10.1111/cla.12111. Epub 2015 Jan 30.
This paper examines a recent proposal to calculate supertrees by minimizing the sum of subtree prune-and-regraft distances to the input trees. The supertrees thus calculated may display groups present in a minority of the input trees but contradicted by the majority, or groups that are not supported by any input tree or combination of input trees. The proponents of the method themselves stated that these are serious problems of "matrix representation with parsimony", but they can in fact occur in their own method. Majority rule supertrees, being explicitly clade-based, cannot have these problems, and seem much better suited to retrieving common clades from a set of trees with different taxon sets. However, it is doubtful that so-called majority rule supertrees can always be interpreted as displaying those clades present in (or compatible with) a majority of the trees. The majority rule consensus is always a median tree in terms of Robinson-Foulds distances (i.e. it minimizes the sum of Robinson-Foulds distances to the input trees). In contrast, majority rule supertrees may not be median: different, contradictory trees may minimize the sum of Robinson-Foulds distances while their strict consensus does not. If being "majority" results from being median in Robinson-Foulds distances, this means that in the supertree setting a "majority" is ambiguously defined, sometimes achievable only by mutually contradictory trees.
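The median property of the majority rule consensus can be illustrated with a small sketch. It is not taken from the paper: it assumes rooted trees on identical taxon sets (the consensus setting, not the supertree setting), represents each tree as a set of its non-trivial clades, and uses hypothetical helper names (`rf_distance`, `majority_rule`, `total_rf`) and a made-up three-tree example. Under this representation the Robinson-Foulds distance is the number of clades found in one tree but not the other. Including a clade costs one unit for every input tree that lacks it, and excluding it costs one unit for every tree that has it, so keeping exactly the clades present in more than half of the trees minimizes the total.

```python
from collections import Counter

def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: clades present in exactly one of the two trees."""
    return len(tree_a ^ tree_b)

def majority_rule(trees):
    """Clades that occur in more than half of the input trees."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade for clade, k in counts.items() if k > len(trees) / 2}

def total_rf(candidate, trees):
    """Sum of Robinson-Foulds distances from a candidate tree to all input trees."""
    return sum(rf_distance(candidate, tree) for tree in trees)

# Illustrative input: three rooted trees on taxa A-D, given by their
# non-trivial clades (assumed example, not data from the paper).
trees = [
    {frozenset("AB"), frozenset("CD")},   # ((A,B),(C,D))
    {frozenset("AB"), frozenset("ABC")},  # (((A,B),C),D)
    {frozenset("AC"), frozenset("BD")},   # ((A,C),(B,D))
]

consensus = majority_rule(trees)
consensus_score = total_rf(consensus, trees)
input_scores = [total_rf(tree, trees) for tree in trees]

print(sorted("".join(sorted(c)) for c in consensus))
print(consensus_score, input_scores)
```

In this example only the clade AB occurs in two of the three trees, so the consensus contains AB alone, and its summed distance is no larger than that of any input tree used as a candidate. The abstract's point is that this guarantee need not carry over once the input trees have different taxon sets.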