Subramanian G, Koonin E V, Aravind L
Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20894, USA.
Infect Immun. 2000 Mar;68(3):1633-48. doi: 10.1128/IAI.68.3.1633-1648.2000.
A comparative analysis of the predicted protein sequences encoded in the complete genomes of Borrelia burgdorferi and Treponema pallidum provides a number of insights into evolutionary trends and adaptive strategies of the two spirochetes. A measure of orthologous relationships between gene sets, termed the orthology coefficient (OC), was developed. The overall OC value for the gene sets of the two spirochetes is about 0.43, which means that less than one-half of the genes show readily detectable orthologous relationships. This emphasizes significant divergence between the two spirochetes, apparently driven by different biological niches. Different functional categories of proteins as well as different protein families show a broad distribution of OC values, from near 1 (a perfect, one-to-one correspondence) to near 0. The proteins involved in core biological functions, such as genome replication and expression, typically show high OC values. In contrast, marked variability is seen among proteins that are involved in specific processes, such as nutrient transport, metabolism, gene-specific transcription regulation, signal transduction, and host response. Differences in the gene complements encoded in the two spirochete genomes suggest active adaptive evolution for their distinct niches. Comparative analysis of the spirochete genomes produced evidence of gene exchanges with other bacteria, archaea, and eukaryotic hosts that seem to have occurred at different points in the evolution of the spirochetes. Examples are presented of the use of sequence profile analysis to predict proteins that are likely to play a role in pathogenesis, including secreted proteins that contain specific protein-protein interaction domains, such as von Willebrand A, YWTD, TPR, and PR1, some of which hitherto have been reported only in eukaryotes. 
We tentatively reconstruct the likely evolutionary process that has led to the divergence of the two spirochete lineages; this reconstruction seems to point to an ancestral state resembling the symbiotic spirochetes found in insect guts.
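The abstract does not state the formula for the orthology coefficient. As a minimal sketch only, the code below assumes one plausible definition that matches the interpretation given above: the fraction of genes across both gene sets that belong to a one-to-one orthologous pair, so that a value near 1 is a perfect correspondence and a value of 0.43 means less than one-half of the genes have a detectable ortholog. The function name, its parameters, and the gene counts in the usage lines are hypothetical and are not taken from the paper.

```python
def orthology_coefficient(n_genes_a: int, n_genes_b: int, n_ortholog_pairs: int) -> float:
    """Return an orthology coefficient under an ASSUMED definition.

    Assumption (not from the paper): OC = 2 * pairs / (genes_a + genes_b),
    i.e. the share of all genes in both sets that sit in a one-to-one
    orthologous pair.
    """
    if min(n_genes_a, n_genes_b, n_ortholog_pairs) < 0:
        raise ValueError("counts must be non-negative")
    if n_ortholog_pairs > min(n_genes_a, n_genes_b):
        raise ValueError("one-to-one pairs cannot exceed the smaller gene set")
    total_genes = n_genes_a + n_genes_b
    if total_genes == 0:
        raise ValueError("at least one gene set must be non-empty")
    # Each one-to-one pair accounts for two genes, one from each set.
    return 2 * n_ortholog_pairs / total_genes


# Illustrative counts only, not real genome data.
perfect = orthology_coefficient(10, 10, 10)      # every gene has a partner
partial = orthology_coefficient(100, 100, 43)    # 43 pairs among 200 genes
```

Under this assumed definition the same function could be applied to a single functional category or protein family by passing the counts for that subset, which is how a per-category distribution of values like the one described above would arise.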