Hahn Matthew W, De Bie Tijl, Stajich Jason E, Nguyen Chi, Cristianini Nello
Center for Population Biology, University of California, Davis, California 95616, USA.
Genome Res. 2005 Aug;15(8):1153-60. doi: 10.1101/gr.3567505.
Comparison of whole genomes has revealed that changes in the size of gene families among organisms are quite common. However, there are as yet no models of gene family evolution that make it possible to estimate ancestral states or to infer upon which lineages gene families have contracted or expanded. In addition, large differences in family size have generally been attributed to the effects of natural selection, without a strong statistical basis for these conclusions. Here we use a model of stochastic birth and death for gene family evolution and show that it can be efficiently applied to multispecies genome comparisons. This model takes into account the lengths of branches on phylogenetic trees, as well as duplication and deletion rates, and hence provides expectations for divergence in gene family size among lineages. The model offers both the opportunity to identify large-scale patterns in genome evolution and the ability to make stronger inferences regarding the role of natural selection in gene family expansion or contraction. We apply our method to data from the genomes of five yeast species to show its applicability.
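To illustrate how a stochastic birth and death model yields expectations for family size along a branch, the following is a minimal sketch, not the authors' implementation. It assumes a linear birth-death process in which the per-gene duplication and deletion rates are equal (a single rate `lam`), an assumption the abstract does not state, and it uses the standard transient distribution for that process. The function name `transition_prob` and the example values for `lam` and `t` are hypothetical.

```python
from math import comb


def transition_prob(s, c, lam, t):
    """Probability that a family of s genes has c genes after time t.

    Assumes a linear birth-death process with equal per-gene birth
    (duplication) and death (deletion) rates, both equal to lam.
    This is an illustrative assumption, not taken from the abstract.
    """
    if s == 0:
        # A family with no genes cannot be re-created by duplication.
        return 1.0 if c == 0 else 0.0

    alpha = lam * t / (1.0 + lam * t)
    total = 0.0
    for j in range(min(s, c) + 1):
        total += (
            comb(s, j)
            * comb(s + c - j - 1, s - 1)
            * alpha ** (s + c - 2 * j)
            * (1.0 - 2.0 * alpha) ** j
        )
    return total


if __name__ == "__main__":
    lam, t = 0.002, 100.0  # hypothetical rate and branch length
    # Distribution of descendant family sizes for an ancestor with 3 genes.
    for size in range(7):
        print(size, transition_prob(3, size, lam, t))
```

Because the probability depends on the product of rate and branch length, longer branches spread the distribution of descendant sizes more widely, which is why the same size difference can be unremarkable on a long branch and unexpected on a short one.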