Wolsan Mieczyslaw, Sato Jun J
Museum and Institute of Zoology, Polish Academy of Sciences, Wilcza 64, 00-679 Warszawa, Poland.
Laboratory of Animal Cell Technology, Faculty of Life Science and Technology, Fukuyama University, Higashimura-cho, Aza, Sanzo, 985, Fukuyama 729-0292, Japan.
Cladistics. 2010 Apr;26(2):168-194. doi: 10.1111/j.1096-0031.2009.00281.x. Epub 2009 Sep 1.
Missing data are commonly thought to impede a resolved or accurate reconstruction of phylogenetic relationships, and probabilistic analysis techniques are increasingly viewed as less vulnerable to the negative effects of data incompleteness than parsimony analyses. We test both assumptions empirically by conducting parsimony and Bayesian analyses on an approximately 1.5 × 10⁶-cell (27 965 characters × 52 species) mustelid-procyonid molecular supermatrix with 62.7% missing entries. Contrary to the first assumption, phylogenetic relationships inferred from our analyses are fully (Bayesian) or almost fully (parsimony) resolved topologically with mostly strong support and also largely in accord with prior molecular estimations of mustelid and procyonid phylogeny derived with parsimony, Bayesian, and other probabilistic analysis techniques from smaller but complete or nearly complete data sets. Contrary to the second assumption, we found no compelling evidence in support of a relationship between the inferior performance of parsimony and taxon incompleteness (i.e. the proportion of missing character data for a taxon), although we found evidence for a connection between the inferior performance of parsimony and character incompleteness (i.e. no overlap in character data between some taxa). The relatively good performance of our analyses may be related to the large number of sampled characters, so that most taxa (even highly incomplete ones) are represented by a sufficient number of characters allowing both approaches to resolve their relationships. © The Willi Hennig Society 2009.
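The two incompleteness measures defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code or data. The toy matrix, the function names, and the use of "?" as the missing-entry symbol are assumptions made for illustration only. The one figure taken from the abstract is the matrix size, 27 965 characters × 52 species.

```python
from itertools import combinations

MISSING = "?"  # assumed symbol for a missing entry (hypothetical convention)

# Matrix size stated in the abstract: 27 965 characters x 52 species.
N_CELLS = 27_965 * 52  # 1 454 180 cells, i.e. roughly 1.5 x 10^6


def taxon_incompleteness(matrix):
    """Proportion of missing character data for each taxon."""
    return {taxon: seq.count(MISSING) / len(seq) for taxon, seq in matrix.items()}


def overall_missing(matrix):
    """Proportion of missing entries across the whole matrix."""
    total = sum(len(seq) for seq in matrix.values())
    missing = sum(seq.count(MISSING) for seq in matrix.values())
    return missing / total


def non_overlapping_pairs(matrix):
    """Pairs of taxa with no character scored in both (no data overlap)."""
    pairs = []
    for (a, seq_a), (b, seq_b) in combinations(matrix.items(), 2):
        shared = sum(
            1 for x, y in zip(seq_a, seq_b) if x != MISSING and y != MISSING
        )
        if shared == 0:
            pairs.append((a, b))
    return pairs


# Hypothetical toy supermatrix: three taxa, eight aligned characters.
toy = {
    "taxon_A": "ACGT????",
    "taxon_B": "????ACGT",
    "taxon_C": "ACGTACGT",
}

if __name__ == "__main__":
    print(N_CELLS)
    print(taxon_incompleteness(toy))
    print(overall_missing(toy))
    print(non_overlapping_pairs(toy))
```

In this toy matrix, taxon_A and taxon_B are each half missing and share no scored character with each other, yet both overlap with taxon_C. This is the kind of pattern the abstract calls character incompleteness, as distinct from a high proportion of missing data in a single taxon.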