Ren Fengrong, Tanaka Hiroshi, Yang Ziheng
Advanced Biomedical Information, Center for Information Medicine, Tokyo Medical and Dental University, Tokyo, Japan.
Gene. 2009 Jul 15;441(1-2):119-25. doi: 10.1016/j.gene.2008.04.002. Epub 2008 Apr 10.
Supermatrix and supertree methods are two strategies advocated for phylogenetic analysis of sequence data from multiple gene loci, especially when some species are missing at some loci. The supermatrix method concatenates sequences from multiple genes into a single data supermatrix for phylogenetic analysis and ignores differences in evolutionary dynamics among the genes. The supertree method analyzes each gene separately and assembles the subtrees estimated from individual genes into a supertree for all species. Most algorithms suggested for supertree construction lack statistical justification and ignore uncertainties in the subtrees. Instead of the supermatrix or supertree approach, we advocate using the likelihood function to combine data from multiple genes while accommodating their differences in the evolutionary process. This approach combines the strengths of the supermatrix and supertree methods while avoiding their drawbacks. We conduct computer simulations to evaluate the performance of the supermatrix, supertree, and maximum likelihood methods applied to two phylogenetic problems: molecular-clock dating of species divergences and reconstruction of species phylogenies. The results confirm the theoretical superiority of the likelihood method. Supertree or separate analyses of data from multiple genes may be useful in revealing the characteristics of the evolutionary process at multiple gene loci, and this information may be used to formulate realistic models for combined analysis of all genes by likelihood.
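The contrast between concatenation and a combined likelihood can be illustrated with a deliberately small sketch. It is not the authors' implementation: it assumes only two aligned sequences per gene, the JC69 substitution model, and hypothetical per-gene counts of sites and differences. The function names (`jc69_loglik`, `jc69_mle`, `supermatrix_loglik`, `partitioned_loglik`) are invented for illustration. The supermatrix-style analysis pools all sites and fits one distance, whereas the combined-likelihood analysis sums per-gene log-likelihoods, each with its own distance parameter.

```python
import math

def jc69_loglik(n_sites, n_diffs, d):
    """Log-likelihood of a JC69 distance d for two aligned sequences,
    given the number of sites and the number of differing sites."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * e)  # one specific identical site pattern
    p_diff = 0.25 * (0.25 - 0.25 * e)  # one specific differing site pattern
    return (n_sites - n_diffs) * math.log(p_same) + n_diffs * math.log(p_diff)

def jc69_mle(n_sites, n_diffs):
    """Closed-form JC69 distance estimate.
    Assumes 0 < n_diffs / n_sites < 0.75."""
    p = n_diffs / n_sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def supermatrix_loglik(genes):
    """Concatenate all genes and fit a single distance to the pooled sites."""
    total_sites = sum(n for n, _ in genes)
    total_diffs = sum(x for _, x in genes)
    d = jc69_mle(total_sites, total_diffs)
    return jc69_loglik(total_sites, total_diffs, d)

def partitioned_loglik(genes):
    """Sum per-gene log-likelihoods, each evaluated at its own distance."""
    return sum(jc69_loglik(n, x, jc69_mle(n, x)) for n, x in genes)

# Hypothetical data: (number of sites, number of differing sites) per gene.
genes = [(1000, 50), (800, 200), (1200, 120)]
```

Because the single-distance model is a special case of the gene-specific model, `partitioned_loglik(genes)` can never be lower than `supermatrix_loglik(genes)`. In a real multi-gene analysis the tree topology or divergence times would be shared across genes while substitution parameters vary by gene, which this two-sequence toy does not capture.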