Complexity of the simplest species tree problem.

Author information

National Center for Mathematics and Interdisciplinary Sciences, Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China.

Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China.

Publication information

Mol Biol Evol. 2021 Aug 23;38(9):3993-4009. doi: 10.1093/molbev/msab009.

Abstract

The multispecies coalescent model provides a natural framework for species tree estimation accounting for gene-tree conflicts. Although a number of species tree methods under the multispecies coalescent have been suggested and evaluated using simulation, their statistical properties remain poorly understood. Here, we use mathematical analysis aided by computer simulation to examine the identifiability, consistency, and efficiency of different species tree methods in the case of three species and three sequences under the molecular clock. We consider four major species-tree methods including concatenation, two-step, independent-sites maximum likelihood, and maximum likelihood. We develop approximations that predict that the probit transform of the species tree estimation error decreases linearly with the square root of the number of loci. Even in this simplest case, major differences exist among the methods. Full-likelihood methods are considerably more efficient than summary methods such as concatenation and two-step. They also provide estimates of important parameters such as species divergence times and ancestral population sizes, whereas these parameters are not identifiable by summary methods. Our results highlight the need to improve the statistical efficiency of summary methods and the computational efficiency of full-likelihood methods of species tree estimation.
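
The relationship between estimation error and the number of loci can be illustrated with a small Python sketch. It is not the paper's analysis. It assumes an idealized majority-vote estimator in which gene tree topologies are observed without error, and it uses the standard three-species coalescent result that a gene tree matches the species tree with probability 1 - (2/3)e^(-T), where T is the internal branch length in coalescent units. The function names and the value of T are hypothetical, chosen only for illustration.

```python
from math import comb, exp, sqrt
from statistics import NormalDist


def gene_tree_match_prob(T):
    """Probability that a three-species gene tree topology matches the
    species tree; T is the internal branch length in coalescent units."""
    return 1.0 - (2.0 / 3.0) * exp(-T)


def majority_vote_error(T, L):
    """Exact probability that the most frequent gene tree topology among
    L independent loci is NOT the species tree topology.
    Assumption: gene trees are known without error; ties are split evenly."""
    p = gene_tree_match_prob(T)   # matching topology
    q = (1.0 - p) / 2.0           # each of the two mismatching topologies
    err = 0.0
    for n1 in range(L + 1):               # loci with the matching topology
        for n2 in range(L - n1 + 1):      # loci with mismatching topology A
            n3 = L - n1 - n2              # loci with mismatching topology B
            prob = comb(L, n1) * comb(L - n1, n2) * p**n1 * q**(n2 + n3)
            top = max(n1, n2, n3)
            winners = (n1 == top) + (n2 == top) + (n3 == top)
            p_correct = 1.0 / winners if n1 == top else 0.0
            err += prob * (1.0 - p_correct)
    return err


def probit(x):
    """Inverse of the standard normal CDF."""
    return NormalDist().inv_cdf(x)


if __name__ == "__main__":
    T = 0.1  # hypothetical short internal branch
    for L in (10, 40, 90, 160):
        e = majority_vote_error(T, L)
        print(f"L={L:4d}  sqrt(L)={sqrt(L):5.2f}  error={e:.4f}  probit={probit(e):.3f}")
```

Printing the probit of the error against sqrt(L) gives a rough feel for the kind of trend the abstract describes. The paper's own approximations cover estimated gene trees and likelihood-based methods, which this sketch does not attempt.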

