Fast and consistent estimation of species trees using supermatrix rooted triples.

Affiliation

Center for Computational Medicine and Bioinformatics, University of Michigan, USA.

Publication information

Mol Biol Evol. 2010 Mar;27(3):552-69. doi: 10.1093/molbev/msp250. Epub 2009 Oct 15.

Abstract

Concatenated sequence alignments are often used to infer species-level relationships. Previous studies have shown that analysis of concatenated data using maximum likelihood (ML) can produce misleading results when loci have differing gene tree topologies due to incomplete lineage sorting. Here, we develop a polynomial time method that utilizes the modified mincut supertree algorithm to construct an estimated species tree from inferred rooted triples of concatenated alignments. We term this method SuperMatrix Rooted Triple (SMRT) and use the notation SMRT-ML when rooted triples are inferred by ML. We use simulations to investigate the performance of SMRT-ML under Jukes-Cantor and general time-reversible substitution models for four- and five-taxon species trees and also apply the method to an empirical data set of yeast genes. We find that SMRT-ML converges to the correct species tree in many cases in which ML on the full concatenated data set fails to do so. SMRT-ML can be conservative in that its output tree is often partially unresolved for problematic clades. We show analytically that when the species tree is clocklike and mutations occur under the Cavender-Farris-Neyman substitution model, as the number of genes increases, SMRT-ML is increasingly likely to infer the correct species tree even when the most likely gene tree does not match the species tree. SMRT-ML is therefore a computationally efficient and statistically consistent estimator of the species tree when gene trees are distributed according to the multispecies coalescent model.
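
The pipeline described in the abstract (infer one rooted triple for each three-taxon subset of the concatenated alignment, then combine the triples into a single tree) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the simpler BUILD algorithm of Aho et al. in place of the modified mincut supertree algorithm, so it returns `None` when the triples conflict rather than resolving the conflict with a minimum cut. The names `build` and `smrt_sketch` and the `infer_triple` callback (a stand-in for the ML step on three taxa) are hypothetical.

```python
from itertools import combinations


def build(taxa, triples):
    """BUILD-style assembly of a rooted tree from rooted triples.

    Each triple is ((a, b), c), meaning a and b are closer to each
    other than either is to c. Returns nested tuples, or None when
    the triples are mutually incompatible.
    """
    taxa = sorted(taxa)
    if len(taxa) == 1:
        return taxa[0]
    if len(taxa) == 2:
        return (taxa[0], taxa[1])

    # Union-find over the current taxon set.
    parent = {t: t for t in taxa}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Keep only triples whose three taxa all lie in the current set.
    relevant = [((a, b), c) for (a, b), c in triples
                if a in parent and b in parent and c in parent]

    # Join the two "close" taxa of every relevant triple.
    for (a, b), _ in relevant:
        parent[find(a)] = find(b)

    groups = {}
    for t in taxa:
        groups.setdefault(find(t), []).append(t)

    # A single component means no split is possible: the triples conflict.
    # (A mincut-based method would cut edges here instead of giving up.)
    if len(groups) == 1:
        return None

    subtrees = []
    for members in groups.values():
        sub = build(members, relevant)
        if sub is None:
            return None
        subtrees.append(sub)
    return tuple(subtrees)


def smrt_sketch(taxa, infer_triple):
    """Infer one rooted triple per three-taxon subset, then combine them.

    infer_triple(x, y, z) is a hypothetical callback that returns
    ((a, b), c) for the three taxa it is given.
    """
    taxa = sorted(taxa)
    triples = [infer_triple(x, y, z) for x, y, z in combinations(taxa, 3)]
    return build(taxa, triples)
```

With n taxa there are n(n-1)(n-2)/6 three-taxon subsets, so the number of triple inferences grows polynomially in n. When a group of three or more taxa has no triple separating its members, the sketch leaves them as siblings under one node, which gives a partially unresolved tree.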
