Large sample approximations of probabilities of correct evolutionary tree estimation and biases of maximum likelihood estimation.

Author

Susko Edward

Affiliation

Dalhousie University.

Publication

Stat Appl Genet Mol Biol. 2011;10:Article 10. doi: 10.2202/1544-6115.1626.

DOI: 10.2202/1544-6115.1626
PMID: 21381435

Abstract

Simulation studies have been the main way in which properties of maximum likelihood estimation of evolutionary trees from aligned sequence data have been studied. Because trees are unusual parameters and because fitting is computationally intensive, such studies have a heavy computational cost. We develop an asymptotic framework that can be used to obtain probabilities of correct topological reconstruction and study other properties of likelihood methods when a single split is poorly resolved. Simulations suggest that while approximations to log likelihood differences are better for less well-resolved topologies, approximations to probabilities of correct reconstruction are generally good. We used the approximations to investigate biases in estimation and found that maximum likelihood estimation has a long-branch-repels bias. This differs from the long-branch-attracts bias often reported in the literature because it is a different form of bias. For maximum likelihood estimation, usually long-branch-attracts bias results arise in the presence of model misspecification and are a form of statistical inconsistency where the estimated tree converges upon an incorrect tree with long edges together. Here, by bias we mean a tendency to favour a particular topology when data are generated from a four-taxon star tree. While we find a tendency to favour the tree with long branches apart, with more extreme long edges, a strong small sequence-length long-branch-attracts bias overwhelms the long-branch-repels bias. The long-branch-repels bias generalizes to five and six taxa in the sense that subtrees containing taxa that are all distant from the poorly resolved split repel each other.

Similar articles

1
Large sample approximations of probabilities of correct evolutionary tree estimation and biases of maximum likelihood estimation.
Stat Appl Genet Mol Biol. 2011;10:Article 10. doi: 10.2202/1544-6115.1626.
2
Bayesian and maximum likelihood phylogenetic analyses of protein sequence data under relative branch-length differences and model violation.
BMC Evol Biol. 2005 Jan 28;5:8. doi: 10.1186/1471-2148-5-8.
3
Quartet-mapping, a generalization of the likelihood-mapping procedure.
Mol Biol Evol. 2001 Jul;18(7):1204-19. doi: 10.1093/oxfordjournals.molbev.a003907.
4
On inconsistency of the neighbor-joining, least squares, and minimum evolution estimation when substitution processes are incorrectly modeled.
Mol Biol Evol. 2004 Sep;21(9):1629-42. doi: 10.1093/molbev/msh159. Epub 2004 May 21.
5
When being "most likely" is not enough: examining the performance of three uses of the parametric bootstrap in phylogenetics.
J Mol Evol. 2003 Feb;56(2):198-222. doi: 10.1007/s00239-002-2394-1.
6
Phylogenetic analysis using parsimony and likelihood methods.
J Mol Evol. 1996 Feb;42(2):294-307. doi: 10.1007/BF02198856.
7
Topological estimation biases with covarion evolution.
J Mol Evol. 2008 Jan;66(1):50-60. doi: 10.1007/s00239-007-9062-4. Epub 2007 Dec 14.
8
Long branch effects distort maximum likelihood phylogenies in simulations despite selection of the correct model.
PLoS One. 2012;7(5):e36593. doi: 10.1371/journal.pone.0036593. Epub 2012 May 9.
9
Maximum likelihood inference of small trees in the presence of long branches.
Syst Biol. 2014 Sep;63(5):798-811. doi: 10.1093/sysbio/syu044. Epub 2014 Jul 4.
10
On the distributions of bootstrap support and posterior distributions for a star tree.
Syst Biol. 2008 Aug;57(4):602-12. doi: 10.1080/10635150802302468.

Cited by

1
More on the Best Evolutionary Rate for Phylogenetic Analysis.
Syst Biol. 2017 Sep 1;66(5):769-785. doi: 10.1093/sysbio/syx051.
2
On the distribution of interspecies correlation for Markov models of character evolution on Yule trees.
J Theor Biol. 2015 Jan 7;364:275-83. doi: 10.1016/j.jtbi.2014.09.016. Epub 2014 Sep 18.