Sección Paleontología de Vertebrados, CONICET-Museo Argentino de Ciencias Naturales, Ángel Gallardo 470, C1405DJR, Ciudad Autónoma de Buenos Aires, Argentina.
School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, B15 2TT, Birmingham, UK.
Cladistics. 2024 Jun;40(3):242-281. doi: 10.1111/cla.12581. Epub 2024 May 10.
Although simulations have shown that implied weighting (IW) outperforms equal weighting (EW) in phylogenetic parsimony analyses, weighting against homoplasy is not widely used in palaeontology. Iterative modifications of several phylogenetic matrices over the last decades have produced extensive genealogies of datasets, which allow differences in the stability of results under alternative character weighting methods to be evaluated directly on empirical data. Each generation was compared against the most recent generation in its genealogy, because the latter is assumed to be the most comprehensive (higher sampling), revised (fewer mis-scorings) and complete (less missing data) matrix of the genealogy. The analyses were conducted on six different genealogies under EW, IW and extended implied weighting (EIW), with concavity constant values (k) ranging from 3 to 30. Pairwise comparisons between trees were conducted using Robinson-Foulds distances normalized by the total number of groups, the distortion coefficient, subtree pruning and regrafting moves, and the proportional sum of group dissimilarities. The results consistently show that IW and EIW produce results more similar to those of the last dataset than EW does, in the vast majority of genealogies and for all comparative measures. This is significant because almost all of these matrices were originally analysed only under EW. Neither IW nor EIW unambiguously outperforms the other. Euclidean distances based on a principal components analysis of the comparative measures show that different ranges of k-values retrieve the results most similar to the last generation in different genealogies. There is a significant positive linear correlation between the optimal k-values and the number of terminals of the last generations. This could be used to guide the range of k-values chosen for phylogenetic analyses based on matrix size, with the caveat that this emergent relationship still rests on a small sample of genealogies.
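As a rough illustration of two quantities named above, the following Python sketch shows how the concavity constant k down-weights homoplasy under implied weighting, and how a Robinson-Foulds distance can be normalized by the total number of groups. It is not the authors' pipeline. It assumes the standard implied-weighting cost h / (k + h) for a character with h extra steps, represents each tree simply as a set of clades, and does not model the missing-data adjustment of EIW. All function names are hypothetical.

```python
def iw_cost(extra_steps, k):
    """Implied-weighting cost of one character with `extra_steps` (h)
    steps of homoplasy: h / (k + h). Assumes k > 0 and h >= 0."""
    return extra_steps / (k + extra_steps)


def iw_tree_score(extra_steps_per_char, k):
    """Total weighted homoplasy of a tree: sum of per-character costs.
    Lower is better; small k penalizes homoplastic characters more."""
    return sum(iw_cost(h, k) for h in extra_steps_per_char)


def normalized_rf(groups_a, groups_b):
    """Robinson-Foulds distance between two trees, each given as a
    collection of groups (clades as frozensets of terminal names),
    divided by the total number of groups in both trees."""
    a, b = set(groups_a), set(groups_b)
    total = len(a) + len(b)
    if total == 0:
        return 0.0  # two fully unresolved trees are treated as identical
    return len(a ^ b) / total  # groups present in only one of the trees


# Hypothetical example: two rooted trees on terminals A, B, C, D.
tree_1 = {frozenset("AB"), frozenset("ABC")}
tree_2 = {frozenset("BC"), frozenset("ABC")}
distance = normalized_rf(tree_1, tree_2)

# Hypothetical per-character extra steps on some tree.
homoplasy = [0, 1, 3, 6]
score_k3 = iw_tree_score(homoplasy, k=3)
score_k30 = iw_tree_score(homoplasy, k=30)
```

At the low end of the range (k = 3) a character with three extra steps already costs 0.5, whereas at k = 30 the same character costs about 0.09, so larger k-values approach the behaviour of equal weighting.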