Fares Mario A, Byrne Kevin P, Wolfe Kenneth H
Department of Genetics, Smurfit Institute, University of Dublin, Trinity College, Dublin 2, Ireland.
Mol Biol Evol. 2006 Feb;23(2):245-53. doi: 10.1093/molbev/msj027. Epub 2005 Oct 5.
Whole-genome duplication (WGD) produces sets of gene pairs that are all of the same age. We therefore expect that phylogenetic trees that relate these pairs to their orthologs in other species should show a single consistent topology. However, a previous study of gene pairs formed by WGD in the yeast Saccharomyces cerevisiae found conflicting topologies among neighbor-joining (NJ) trees drawn from different loci and suggested that this conflict was the result of "asynchronous functional divergence" of duplicated genes (Langkjaer, R. B., P. F. Cliften, M. Johnston, and J. Piskur. 2003. Yeast genome duplication was followed by asynchronous differentiation of duplicated genes. Nature 421:848-852). Here, we test whether the conflicting topologies might instead be due to asymmetrical rates of evolution leading to long-branch attraction (LBA) artifacts in phylogenetic trees. We constructed trees for 433 pairs of WGD paralogs in S. cerevisiae with their single orthologs in Saccharomyces kluyveri and Candida albicans. We find a strong correlation between the asymmetry of evolutionary rates of a pair of S. cerevisiae paralogs and the topology of the tree inferred for that pair. Saccharomyces cerevisiae gene pairs with approximately equal rates of evolution tend to give phylogenies in which the WGD postdates the speciation between S. cerevisiae and S. kluyveri ("B-trees"), whereas trees drawn from gene pairs with asymmetrical rates tend to show WGD predating this speciation ("A-trees"). Gene order data from throughout the genome indicate that the A-trees are artifacts, even though more than 50% of gene pairs are inferred to have this topology when the NJ method as implemented in ClustalW (i.e., with Poisson correction of distances) is used to construct the trees. This LBA artifact can be ameliorated, but not eliminated, by using gamma-corrected distances or by using maximum likelihood trees with robustness estimated by the Shimodaira-Hasegawa test. Tests for adaptive evolution indicated that positive selection might be the cause of rate asymmetry in a substantial fraction (19%) of the paralog pairs.
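
The contrast between Poisson-corrected and gamma-corrected distances mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard textbook formulas for amino acid distances, where p is the proportion of differing sites between two aligned sequences. The function names, the example p values, and the gamma shape parameter alpha = 1.0 are assumptions chosen for illustration; the abstract does not state which alpha the authors used.

```python
import math


def poisson_distance(p: float) -> float:
    """Poisson-corrected distance: d = -ln(1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in the interval [0, 1)")
    return -math.log(1.0 - p)


def gamma_distance(p: float, alpha: float = 1.0) -> float:
    """Gamma-corrected distance: d = alpha * ((1 - p)^(-1/alpha) - 1).

    alpha is the shape parameter of the gamma distribution of rates
    across sites; the default of 1.0 is an illustrative assumption.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in the interval [0, 1)")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return alpha * ((1.0 - p) ** (-1.0 / alpha) - 1.0)


if __name__ == "__main__":
    # Hypothetical proportions of differing sites, from similar to divergent.
    for p in (0.1, 0.3, 0.6):
        print(p, poisson_distance(p), gamma_distance(p, alpha=1.0))
```

For any p greater than zero the gamma-corrected value exceeds the Poisson-corrected one, and the gap widens as p grows. Divergent (long-branch) sequences are therefore stretched more than similar ones, which is consistent with the abstract's observation that gamma correction reduces, but does not remove, the LBA artifact.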