

Accounting for uncertainty in the tree topology has little effect on the decision-theoretic approach to model selection in phylogeny estimation.

Authors

Abdo Zaid, Minin Vladimir N, Joyce Paul, Sullivan Jack

Affiliation

Initiative in Bioinformatics and Evolutionary Studies (IBEST), University of Idaho, Moscow.

Publication

Mol Biol Evol. 2005 Mar;22(3):691-703. doi: 10.1093/molbev/msi050. Epub 2004 Nov 17.

DOI: 10.1093/molbev/msi050
PMID: 15548751

Abstract

Currently available methods for model selection used in phylogenetic analysis are based on an initial fixed-tree topology. Once a model is picked based on this topology, a rigorous search of the tree space is run under that model to find the maximum-likelihood estimate of the tree (topology and branch lengths) and the maximum-likelihood estimates of the model parameters. In this paper, we propose two extensions to the decision-theoretic (DT) approach that relax the fixed-topology restriction. We also relax the fixed-topology restriction for the Bayesian information criterion (BIC) and the Akaike information criterion (AIC) methods. We compare the performance of the different methods (the relaxed, restricted, and the likelihood-ratio test [LRT]) using simulated data. This comparison is done by evaluating the relative complexity of the models resulting from each method and by comparing the performance of the chosen models in estimating the true tree. We also compare the methods relative to one another by measuring the closeness of the estimated trees corresponding to the different chosen models under these methods. We show that varying the topology does not have a major impact on model choice. We also show that the outcome of the two proposed extensions is identical and is comparable to that of the BIC, Extended-BIC, and DT. Hence, using the simpler methods in choosing a model for analyzing the data is more computationally feasible, with results comparable to the more computationally intensive methods. Another outcome of this study is that earlier conclusions about the DT approach are reinforced. That is, LRT, Extended-AIC, and AIC result in more complicated models that do not contribute to the performance of the phylogenetic inference, yet cause a significant increase in the time required for data analysis.
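
As a rough illustration of why AIC can favor a more complex model than BIC, the sketch below scores three candidate substitution models on a fixed topology. It is not the authors' procedure: the log-likelihood values and the alignment length are hypothetical, only substitution-model parameters are counted (on a fixed topology the branch-length count is the same for every candidate, so it does not change the ranking), and using the number of sites as the BIC sample size is an assumed convention.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical candidates: name -> (maximized log-likelihood, free parameters).
# Parameter counts cover the substitution model only (an assumption).
candidates = {
    "JC69": (-5230.0, 0),
    "HKY85": (-5105.0, 4),
    "GTR+G": (-5099.0, 9),
}
n_sites = 1000  # hypothetical alignment length, used as the BIC sample size

def select(models, score):
    """Return the name of the model with the lowest score."""
    return min(models, key=lambda name: score(*models[name]))

best_aic = select(candidates, aic)
best_bic = select(candidates, lambda ll, k: bic(ll, k, n_sites))

print("AIC picks:", best_aic)
print("BIC picks:", best_bic)
```

With these made-up numbers AIC selects the parameter-rich model while BIC selects the simpler one, because the BIC penalty per parameter, ln(n), exceeds the AIC penalty of 2 once n is larger than about 7.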


