Lanier Hayley C, Knowles L Lacey
Department of Ecology and Evolutionary Biology, Museum of Zoology, University of Michigan, Ann Arbor, MI 48109-1079, USA.
Mol Phylogenet Evol. 2015 Feb;83:191-9. doi: 10.1016/j.ympev.2014.10.022. Epub 2014 Nov 4.
Coalescent-based methods for species-tree estimation are becoming a dominant approach for reconstructing species histories from multi-locus data, with most of the studies examining these methodologies focused on recently diverged species. However, deeper phylogenies, such as the datasets that comprise many Tree of Life (ToL) studies, also exhibit gene-tree discordance. This discord may also arise from the stochastic sorting of gene lineages during the speciation process (i.e., reflecting the random coalescence of gene lineages in ancestral populations). It remains unknown whether guidelines regarding methodologies and numbers of loci established by simulation studies at shallow tree depths translate into accurate species relationships for deeper phylogenetic histories. We address this knowledge gap and specifically identify the challenges and limitations of species-tree methods that account for coalescent variance for deeper phylogenies. Using simulated data with characteristics informed by empirical studies, we evaluate both the accuracy of estimated species trees and the characteristics associated with recalcitrant nodes, with a specific focus on whether coalescent variance is generally responsible for the lack of resolution. By determining the proportion of coalescent genealogies that support a particular node, we demonstrate that (1) species-tree methods account for coalescent variance at deep nodes and (2) mutational variance, not gene-tree discord arising from the coalescent, posed the primary challenge for accurate reconstruction across the tree. For example, many nodes were accurately resolved despite predicted discord from the random coalescence of gene lineages, and nodes with poor support were distributed across a range of depths (i.e., they were not restricted to particularly recent divergences).
Given their broad taxonomic scope and large sampling of taxa, deep-level phylogenies pose several potential methodological complications, including difficulties with MCMC convergence and with estimating the population genetic parameters that coalescent-based approaches require. Despite these difficulties, the findings generally support the utility of species-tree analyses for the estimation of species relationships throughout the ToL. We discuss strategies for successful application of species-tree approaches to deep phylogenies.
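The idea of a "proportion of coalescent genealogies that support a particular node" can be illustrated with the simplest textbook case, which is not the simulation design used in the study: three species with one sampled lineage each. Under the standard multispecies coalescent, the expected fraction of gene trees that match the species tree depends only on the length t of the internal branch in coalescent units, and equals 1 - (2/3)exp(-t). The function name and the example branch lengths below are hypothetical, chosen only for illustration.

```python
import math


def concordance_probability(t):
    """Expected fraction of gene trees matching a three-species tree.

    t is the internal branch length in coalescent units
    (generations divided by 2N for a diploid population of
    effective size N). Illustrative three-taxon result only;
    it does not reproduce the study's simulations.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # With probability exp(-t) the two sister lineages fail to coalesce
    # on the internal branch; then all three topologies are equally
    # likely, so two thirds of those cases are discordant.
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


if __name__ == "__main__":
    # Hypothetical branch lengths, from very short to long.
    for t in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(f"t = {t:>4}: P(concordant) = {concordance_probability(t):.3f}")
```

Short internal branches give values close to one third, meaning heavy discord is expected from the coalescent alone, and this holds whether the branch sits near the tips or deep in the tree. That is why a node's depth by itself does not determine how much coalescent discord to expect.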