Accounting for uncertainty in the tree topology has little effect on the decision-theoretic approach to model selection in phylogeny estimation.

Authors

Abdo Zaid, Minin Vladimir N, Joyce Paul, Sullivan Jack

Affiliation

Initiative in Bioinformatics and Evolutionary Studies (IBEST), University of Idaho, Moscow.

Publication information

Mol Biol Evol. 2005 Mar;22(3):691-703. doi: 10.1093/molbev/msi050. Epub 2004 Nov 17.

Abstract

Currently available methods for model selection used in phylogenetic analysis are based on an initial fixed-tree topology. Once a model is picked based on this topology, a rigorous search of the tree space is run under that model to find the maximum-likelihood estimate of the tree (topology and branch lengths) and the maximum-likelihood estimates of the model parameters. In this paper, we propose two extensions to the decision-theoretic (DT) approach that relax the fixed-topology restriction. We also relax the fixed-topology restriction for the Bayesian information criterion (BIC) and the Akaike information criterion (AIC) methods. We compare the performance of the different methods (the relaxed, restricted, and the likelihood-ratio test [LRT]) using simulated data. This comparison is done by evaluating the relative complexity of the models resulting from each method and by comparing the performance of the chosen models in estimating the true tree. We also compare the methods relative to one another by measuring the closeness of the estimated trees corresponding to the different chosen models under these methods. We show that varying the topology does not have a major impact on model choice. We also show that the outcome of the two proposed extensions is identical and is comparable to that of the BIC, Extended-BIC, and DT. Hence, using the simpler methods in choosing a model for analyzing the data is more computationally feasible, with results comparable to the more computationally intensive methods. Another outcome of this study is that earlier conclusions about the DT approach are reinforced. That is, LRT, Extended-AIC, and AIC result in more complicated models that do not contribute to the performance of the phylogenetic inference, yet cause a significant increase in the time required for data analysis.
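
As a rough illustration of why AIC can favor more complex models than BIC, the sketch below scores four nested substitution models with the standard formulas AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. It is not the paper's code and does not implement the DT approach or its extensions. The log-likelihood values and the alignment length are hypothetical numbers chosen for illustration, and k counts only substitution-model parameters, on the assumption of a fixed topology where branch lengths add the same constant to every model.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


# HYPOTHETICAL maximized log-likelihoods on one fixed topology.
# k = number of free substitution-model parameters only (assumption:
# branch lengths are shared by all models and therefore omitted).
models = {
    "JC69": (-5230.0, 0),
    "HKY85": (-5105.0, 4),
    "GTR": (-5099.0, 8),
    "GTR+G": (-5096.5, 9),
}
n_sites = 1000  # hypothetical alignment length, used as the sample size n

aic_scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
bic_scores = {name: bic(ll, k, n_sites) for name, (ll, k) in models.items()}

best_aic = min(aic_scores, key=aic_scores.get)
best_bic = min(bic_scores, key=bic_scores.get)

for name in models:
    print(f"{name:6s} AIC={aic_scores[name]:10.2f} BIC={bic_scores[name]:10.2f}")
print("AIC selects:", best_aic)
print("BIC selects:", best_bic)
```

With these made-up numbers, the per-parameter penalty under BIC is ln(1000), roughly 6.9, against 2 under AIC, so the small likelihood gains of the richer models are enough to win under AIC but not under BIC. This mirrors the direction of the abstract's conclusion, though the actual comparison in the paper rests on simulated data rather than on a toy table like this one.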
