Simmons Mark P, Springer Mark S, Gatesy John
Department of Biology, Colorado State University, Fort Collins, CO 80523, USA.
Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, CA 92521, USA.
Mol Phylogenet Evol. 2022 Feb;167:107344. doi: 10.1016/j.ympev.2021.107344. Epub 2021 Nov 5.
Phylogenomic analyses of ancient rapid radiations can produce conflicting results that are driven by differential sampling of taxa and characters as well as the limitations of alternative analytical methods. We re-examine basal relationships of palaeognath birds (ratites and tinamous) using recently published datasets of nucleotide characters from 20,850 loci as well as 4,301 retroelement insertions. The original studies attributed conflicting resolutions of rheas in their inferred coalescent and concatenation trees to concatenation failing in the anomaly zone. By contrast, we find that the coalescent-based resolution of rheas is premised upon extensive gene-tree estimation errors. Furthermore, retroelement insertions contain much more conflict than originally reported and multiple insertion loci support the basal position of rheas found in concatenation trees, while none were reported in the original publication. We demonstrate how even remarkable congruence in phylogenomic studies may be driven by long-branch misplacement of a divergent outgroup, highly incongruent gene trees, differential taxon sampling that can result in gene-tree misrooting errors that bias species-tree inference, and gross homology errors. What was previously interpreted as broad, robustly supported corroboration for a single resolution in coalescent analyses may instead indicate a common bias that taints phylogenomic results across multiple genome-scale datasets. The updated retroelement dataset now supports a species tree with branch lengths that suggest an ancient anomaly zone, and both concatenation and coalescent analyses of the huge nucleotide datasets fail to yield coherent, reliable results in this challenging phylogenetic context.