Institut de Systématique, Evolution, Biodiversité ISYEB - UMR 7205 - MNHN CNRS UPMC EPHE, Muséum national d'Histoire naturelle, Sorbonne Universités, CP50, 57 rue Cuvier, 75005 Paris, France.
Department of Biological Sciences, Rutgers, The State University of New Jersey, 195 University Ave., Newark, NJ 07102, United States.
Mol Phylogenet Evol. 2018 Nov;128:112-122. doi: 10.1016/j.ympev.2018.05.007. Epub 2018 Jun 30.
Assessing support for molecular phylogenies is difficult because the data are heterogeneous in quality and overwhelming in quantity. Traditionally, node support values (bootstrap frequency, Bayesian posterior probability) are used to assess confidence in tree topologies. Other analyses to assess the quality of phylogenetic data (e.g., Lento plots, saturation plots, trait consistency) and the resulting phylogenetic trees (e.g., internode certainty, parameter permutation tests, topological tests) exist but are rarely applied. Here we argue that a single qualitative analysis is insufficient to assess support of a phylogenetic hypothesis and relate data quality to tree quality. We use six molecular markers to infer the phylogeny of Blattodea and apply various tests to assess relationship support, locus quality, and the relationship between the two. We use internode-certainty calculations in conjunction with bootstrap scores, alignment permutations, and an approximately unbiased (AU) test to assess whether the molecular data unambiguously support the phylogenetic relationships found. Our results show higher support for the position of Lamproblattidae, high support for the termite phylogeny, and low support for the positions of Anaplectidae and Corydioidea and for the phylogeny of Blaberoidea. We use Lento plots in conjunction with mutation-saturation plots and calculations of locus homoplasy to assess locus quality, identify long-branch attraction, and decide whether the tree's relationships are the result of data biases. We conclude that multiple tests and metrics need to be taken into account to assess tree support and data robustness.