Department of Biology, Indiana University, Bloomington, Indiana 47405, USA.
Genetics. 2010 Sep;186(1):411-26. doi: 10.1534/genetics.110.118448. Epub 2010 Jun 15.
Variation in bacterial gene content over the course of evolution is widely acknowledged, and its pattern has been actively modeled in recent years. Gene truncation or pseudogenization also plays an important role in shaping bacterial genome content. Truncated genes can also arise from small-scale lateral gene transfer events. However, no existing mathematical model of gene content variation takes information on truncated genes into account. In this study, we developed a model that incorporates truncated genes. Maximum-likelihood estimates (MLEs) under the new model reveal fast rates of gene insertions/deletions on recent branches, suggesting a fast turnover of many recently transferred genes. The estimates also suggest that many truncated genes are in the process of being eliminated from the genome. Furthermore, we demonstrate that ignoring truncated genes in the estimation does not lead to a systematic bias but rather has a more complicated effect. Analysis using the new model not only provides more accurate estimates of gene gains/losses (or insertions/deletions), but also reduces concern about systematic bias from applying simplified models to bacterial genome evolution. Although this is not its primary purpose, the model incorporating truncated genes could potentially be used for phylogeny reconstruction based on gene family content.