Maximum likelihood estimates of species trees: how accuracy of phylogenetic inference depends upon the divergence history and sampling design.

Author affiliation

Department of Ecology and Evolutionary Biology, and the Museum of Zoology, University of Michigan, Ann Arbor, MI 48109-1079, USA.

Publication information

Syst Biol. 2009 Oct;58(5):501-8. doi: 10.1093/sysbio/syp045. Epub 2009 Aug 20.

Abstract

The understanding that gene trees are often in discord with each other and with the species trees that contain them has led researchers to methods that incorporate the inherent stochasticity of genetic processes in the phylogenetic estimation procedure. Recently developed methods for species-tree estimation that not only consider the retention and sorting of ancestral polymorphism but also quantify the actual probabilities of incomplete lineage sorting are expected to provide an improvement over earlier summary-statistic based approaches that discard much of the information content of gene trees. However, these new methods have yet to be tested on truly challenging evolutionary histories such as those marked by recent rapid speciation where high levels of incomplete lineage sorting and discord among gene trees predominate. Here, we test a new maximum-likelihood method that incorporates stochastic models of both nucleotide substitution and lineage sorting for species-tree estimation. Using a simulation approach, we consider a broad range of species-tree topologies under 2 scenarios representing moderate and severe incomplete lineage sorting. We show that the maximum-likelihood method results in more accurate species trees than a summary-statistic based approach, demonstrating that information contained in discordant gene trees can be effectively extracted using a full probabilistic model. Moreover, we demonstrate that the shape of the original species tree (i.e., the relative lengths of internal branches) has a significant impact on whether the species tree is estimated accurately. In the speciation histories explored here, it is not just the recent origin of species that affects the accuracy of the estimates but the variance in relative species divergence times as well. 
Additionally, we show that sampling effort (number of individuals and/or loci) and sampling design (ratio of individuals to loci) are both important factors affecting the accuracy of species-tree estimates, which is again affected by the relative timing of divergence among species. The inherent difficulties of estimating relationships when species have undergone a recent radiation are discussed, and in particular, the limitations with maximum-likelihood estimates of species trees that do not consider uncertainty in the estimated gene trees of individual loci. Thus, despite substantial improvements over current summary-statistic based approaches, and the increased sophistication of procedures that incorporate the process of gene lineage coalescence, recent radiations still appear to pose daunting challenges for phylogenetics.
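The dependence of accuracy on internal branch lengths can be illustrated with a standard result from coalescent theory for the simplest case: three species with one sampled lineage each. The sketch below is not the maximum-likelihood method tested in the paper. It only computes the textbook probability that a gene-tree topology disagrees with the species tree, assuming a diploid population of constant size. The function name and parameter values are hypothetical choices for illustration.

```python
import math


def discordance_probability(internal_branch_generations, effective_population_size):
    """Probability that a 3-taxon gene tree differs in topology from the species tree.

    Assumes one sampled lineage per species and a constant diploid
    effective population size along the internal branch.
    """
    # Internal branch length in coalescent units: generations / (2 * Ne).
    t = internal_branch_generations / (2.0 * effective_population_size)
    # With probability exp(-t) the two sister lineages fail to coalesce
    # within the internal branch. All three lineages then enter the
    # ancestral population, where each of the three topologies is equally
    # likely, so two out of three outcomes are discordant.
    return (2.0 / 3.0) * math.exp(-t)


if __name__ == "__main__":
    # Illustrative values only: shorter internal branches (rapid speciation)
    # give more incomplete lineage sorting and more discordant gene trees.
    for generations in (1_000, 10_000, 100_000):
        p = discordance_probability(generations, effective_population_size=10_000)
        print(f"{generations:>7} generations: P(discordant) = {p:.3f}")
```

As the internal branch shrinks toward zero, the probability approaches 2/3, which is why recent rapid radiations leave little topological signal in any single locus.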
