New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0.

Affiliation

Méthodes et Algorithmes pour la Bioinformatique, LIRMM, Centre National de la Recherche Scientifique, Université de Montpellier, Montpellier Cedex 5, France.

Publication information

Syst Biol. 2010 May;59(3):307-21. doi: 10.1093/sysbio/syq010. Epub 2010 Mar 29.

Abstract

PhyML is phylogeny software based on the maximum-likelihood principle. Early PhyML versions used a fast algorithm performing nearest neighbor interchanges to improve a reasonable starting tree topology. Since the original publication (Guindon S., Gascuel O. 2003. A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696-704), PhyML has been widely used (>2500 citations in ISI Web of Science) because of its simplicity and a fair compromise between accuracy and speed. In the meantime, research around PhyML has continued, and this article describes the new algorithms and methods implemented in the program. First, we introduce a new algorithm to search the tree space with user-defined intensity using subtree pruning and regrafting topological moves. The parsimony criterion is used here to filter out the least promising topology modifications with respect to the likelihood function. The analysis of a large collection of real nucleotide and amino acid data sets of various sizes demonstrates the good performance of this method. Second, we describe a new test to assess the support of the data for internal branches of a phylogeny. This approach extends the recently proposed approximate likelihood-ratio test and relies on a nonparametric, Shimodaira-Hasegawa-like procedure. A detailed analysis of real alignments sheds light on the links between this new approach and the more classical nonparametric bootstrap method. Overall, our tests show that the latest version (3.0) of PhyML is fast, accurate, stable, and ready to use. A Web server and binary files are available from http://www.atgc-montpellier.fr/phyml/.
