Carstens Bryan C, Knowles L Lacey
Department of Ecology and Evolutionary Biology, Museum of Zoology, University of Michigan, Ann Arbor, MI 48109-1079, USA.
Syst Biol. 2007 Jun;56(3):400-11. doi: 10.1080/10635150701405560.
Estimating phylogenetic relationships among closely related species can be extremely difficult when there is incongruence among gene trees and between the gene trees and the species tree. Here we show that incorporating a model of the stochastic loss of gene lineages by genetic drift into the phylogenetic estimation procedure can provide a robust estimate of species relationships, despite widespread incomplete sorting of ancestral polymorphism. This approach is applied to a group of montane Melanoplus grasshoppers for which genealogical discordance among loci and incomplete lineage sorting obscure any obvious phylogenetic relationships among species. Unlike traditional treatments, in which gene trees estimated using standard phylogenetic methods are implicitly equated with the species tree, the coalescent-based approach models the species tree probabilistically from the estimated gene trees. The estimated species phylogeny (the ESP) is calculated for the grasshoppers from multiple gene trees reconstructed for nuclear loci and a mitochondrial gene. This empirical application is coupled with a simulation study to explore the performance of the coalescent-based approach. Specifically, we test the accuracy of the ESP using simulated data matching the multilocus data collected in Melanoplus (i.e., data were simulated for each locus with the same number of base pairs and locus-specific mutational models). The results of the study show that ESPs can be computed using the coalescent-based approach long before reciprocal monophyly has been achieved, and that these statistical estimates are accurate. This contrasts with concatenation-based analyses of both the empirical Melanoplus data and the simulated data, for which the incomplete lineage sorting of recently diverged species posed significant problems. The strengths and potential challenges of incorporating an explicit model of gene-lineage coalescence into the phylogenetic procedure to obtain an ESP, as illustrated by the application to Melanoplus, are discussed in comparison with concatenation and consensus approaches. This study represents a fundamental shift in how species relationships are estimated: the relationship between the gene trees and the species phylogeny is modeled probabilistically rather than by equating gene trees with a species tree.
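To make the idea of stochastic lineage loss concrete, the following minimal sketch illustrates why gene trees can disagree with the species tree when divergences are recent. It is not the authors' ESP procedure. It assumes the simplest textbook case: three species ((A,B),C), one sampled lineage per species, constant population size, and an internal branch of length t in coalescent units. Under those assumptions the standard result is that a gene tree matches the species tree with probability 1 - (2/3)e^(-t). The function names and parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def concordance_probability(t):
    """Probability that a gene tree matches the species tree ((A,B),C).

    Assumes one lineage per species and an internal branch of length t
    measured in coalescent units (illustrative three-species case only).
    """
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_concordance(t, n_loci, seed=0):
    """Estimate the same probability by simulating independent loci."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_loci):
        # The A and B lineages coalesce at rate 1 along the internal branch.
        if rng.expovariate(1.0) < t:
            matches += 1  # sorted within the branch: gene tree is concordant
        elif rng.randrange(3) == 0:
            # Incomplete lineage sorting: all three lineages reach the root
            # population, where each of the three pairs is equally likely
            # to coalesce first; only one pair (A,B) gives a concordant tree.
            matches += 1
    return matches / n_loci


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 3.0):
        exact = concordance_probability(t)
        approx = simulate_concordance(t, n_loci=20000)
        print(f"t={t:.1f}  exact={exact:.3f}  simulated={approx:.3f}")
```

With a short internal branch the concordance probability approaches 1/3, so a single gene tree, or a concatenation that implicitly assumes one shared tree, carries little information about the species tree. This is the situation a probabilistic model of the gene-tree and species-tree relationship is meant to handle.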