University of Freiburg, Center for Biosystems Analysis, Habsburgerstrasse 49, Germany.
Genome Biol Evol. 2012;4(4):443-56. doi: 10.1093/gbe/evs016. Epub 2012 Feb 21.
The distributed genome hypothesis states that the gene pool of a bacterial taxon is much more complex than that found in a single individual genome. However, the possible fitness advantage of such genomic diversity, why it is maintained, whether this variation is largely adaptive or neutral, and why these distinct individuals can coexist remain poorly understood. Here, we present the infinitely many genes (IMG) model, a quantitative evolutionary model for the distributed genome. It is based on a genealogy of individual genomes and allows for gene gain (from an unbounded reservoir of novel genes, e.g., by horizontal gene transfer from distant taxa) and gene loss (e.g., by pseudogenization and deletion of genes) during reproduction. By implementing these mechanisms, the IMG model differs from existing concepts for the distributed genome, which cannot differentiate between neutral evolution and adaptation as drivers of the observed genomic diversity. Using the IMG model, we tested whether the distributed genome of 22 full genomes of picocyanobacteria (Prochlorococcus and Synechococcus) shows signs of adaptation or neutrality. We calculated the effective population size of Prochlorococcus at 1.01 × 10^11 and predicted 18 distinct clades for this population, only six of which have been isolated and cultured thus far. We predicted that the Prochlorococcus pangenome contains 57,792 genes and found that the evolution of the distributed genome of Prochlorococcus was possibly neutral, whereas that of Synechococcus and that of the combined sample show a clear deviation from neutrality.
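The gain-and-loss mechanism described in the abstract can be illustrated with a small neutral simulation. The following is only a minimal sketch under stated assumptions, not the authors' implementation or their estimation procedure. It assumes a Kingman coalescent genealogy, a gene gain rate of theta/2 and a per-gene loss rate of rho/2 per unit of coalescent time, and it ignores core genes that are never lost. The function names (`simulate_img`, `gene_frequency_spectrum`) and the parameter values are hypothetical and chosen for illustration only.

```python
import collections
import itertools
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_img(n, theta, rho, rng):
    """Simulate dispensable-gene content of n genomes (assumed neutral model).

    Assumptions (not taken from the paper's text): Kingman coalescent tree,
    gain rate theta/2 along each lineage, loss rate rho/2 per gene.
    """
    new_gene = itertools.count()  # every gained gene is novel
    heights = [0.0] * n           # node heights; leaves sit at time 0
    children = [() for _ in range(n)]
    active = list(range(n))
    t = 0.0
    # Build the genealogy backwards in time by merging random pairs.
    while len(active) > 1:
        k = len(active)
        t += rng.expovariate(k * (k - 1) / 2)
        a, b = rng.sample(active, 2)
        active = [x for x in active if x not in (a, b)]
        heights.append(t)
        children.append((a, b))
        active.append(len(heights) - 1)
    root = active[0]
    # The root genome starts at the gain/loss equilibrium, Poisson(theta/rho).
    genes = {root: {next(new_gene) for _ in range(poisson(rng, theta / rho))}}
    stack = [root]
    # Drop genes forward in time along each branch.
    while stack:
        parent = stack.pop()
        for child in children[parent]:
            # Probability that a gene survives the whole branch.
            keep = math.exp(-rho * (heights[parent] - heights[child]) / 2)
            inherited = {g for g in genes[parent] if rng.random() < keep}
            # Genes gained on the branch that are still present at its end.
            n_gained = poisson(rng, theta / rho * (1 - keep))
            gained = {next(new_gene) for _ in range(n_gained)}
            genes[child] = inherited | gained
            stack.append(child)
    return [genes[i] for i in range(n)]


def gene_frequency_spectrum(genomes):
    """Return a list where entry k counts genes found in exactly k genomes."""
    carriers = collections.Counter(g for genome in genomes for g in genome)
    spectrum = [0] * (len(genomes) + 1)
    for c in carriers.values():
        spectrum[c] += 1
    return spectrum


if __name__ == "__main__":
    rng = random.Random(2012)
    sample = simulate_img(n=6, theta=20.0, rho=2.0, rng=rng)  # illustrative values
    print("pangenome size:", len(set().union(*sample)))
    print("gene frequency spectrum:", gene_frequency_spectrum(sample)[1:])
```

In a sketch like this, the simulated gene frequency spectrum is the kind of summary one could compare with the spectrum observed in real genomes when asking whether the data are compatible with neutrality. How the paper actually performs that comparison is not specified in the abstract.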