Brochier Céline, Forterre Patrick, Gribaldo Simonetta
Laboratoire Evolution, Génomique, Environnement, Université Aix-Marseille I, Centre Saint-Charles, Case 36, 3 Place Victor Hugo, 13331 Marseille, Cedex 3, France.
BMC Evol Biol. 2005 Jun 2;5:36. doi: 10.1186/1471-2148-5-36.
The concept of a genomic core, defined as the set of genes ubiquitous in all genomes of a monophyletic group, has become crucial in comparative and evolutionary genomics. However, it is still a matter of debate whether lateral gene transfers (LGT) may affect the components of genomic cores, preventing their use in retracing species evolution. We have recently reconstructed the phylogeny of Archaea by using two large concatenated datasets of core proteins involved in translation and transcription, respectively. The resulting trees were largely congruent, showing that informational gene components of the archaeal genomic core belonging to two distinct molecular systems contain a coherent signal for archaeal phylogeny. However, some incongruence remained between the two phylogenies. This may be due to undetected LGT and/or to a lack of sufficient phylogenetic signal in the datasets.
We present evidence strongly favoring the latter hypothesis. We have updated our transcription and translation datasets with five new archaeal genomes, for a total of 6384 and 2928 amino acid positions, respectively, and 25 taxa. This increase in taxonomic sampling led to the nearly complete convergence of the transcription-based and translation-based trees on a single phylogenetic pattern for archaeal evolution. In fact, only a single incongruence persisted between the two phylogenies. This concerned Methanopyrus kandleri, whose placement remained strongly biased in the transcription tree due to its above-average evolutionary rates, a bias that could not be counterbalanced because no closely related and/or slower-evolving relatives were available.
To our knowledge, this is the first report of evidence that the phylogenetic signal harbored by components of the archaeal translation apparatus is confirmed by additional markers belonging to a second molecular system (i.e. transcription). This rules out the risk of circularity when inferring species evolution by small subunit ribosomal RNA and ribosomal protein sequences, since it has been suggested that concerted LGT may affect these markers. Our results strongly support the existence of a core of proteins that has evolved mainly through vertical inheritance in Archaea, and carries a bona fide phylogenetic signal that can be used to retrace the evolutionary history of this domain. The identification and analysis of additional molecular markers not affected by LGT should continue defining the emerging picture of a genuine phylogenetic core for the third domain of life.