Karlin S, Weinstock G M, Brendel V
Department of Mathematics, Stanford University, California 94305-2125, USA.
J Bacteriol. 1995 Dec;177(23):6881-93. doi: 10.1128/jb.177.23.6881-6893.1995.
RecA protein sequences from 62 eubacterial sources were compared with one another and with one archaebacterial RecA-like sequence and a number of eukaryotic RecA-like sequences. Pairwise similarity scores were determined by a novel method based on significant segment pair alignment. The sequences of different species were grouped on the basis of mutually high similarity scores within groups and consistency of score ranges in comparison to other groups. Following this protocol, the gamma-proteobacteria can be subclassified into two major groups, one associated mostly with vertebrate hosts and one found mostly in soil habitats. The alpha-proteobacterial sequences also divide into two distinct groups, whereas classification of the beta-proteobacteria is more complex. The gram-positive bacterial sequences split into three groups of low and three groups of high G+C genome content. However, neither the combined low-G+C-content group, nor the combined high-G+C-content group, nor the aggregate of all gram-positive bacteria forms a homogeneous group. The mycoplasma sequences score best with the Bacillus subtilis sequence, consistent with their presumed origin from a gram-positive ancestor. The eukaryotic RAD proteins generally show a single high-scoring segment pair with the proteobacterial RecA sequences around the ATP-binding domain. The bacteriophage T4 UvsX protein aligns best with RecA sequences on two segments disjoint from the ATP-binding domain. The distribution of the most highly conserved regions shared between RecA and noneubacterial RecA-like sequences suggests a mosaic character and evolution of RecA. The discussion considers some questions on the validity and consistency of bacterial classifications derived from RecA sequence comparisons.
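The grouping criterion described in the abstract (mutually high scores within a group, and a consistent score range against other groups) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the significant segment pair scoring itself is not implemented, the pairwise scores are assumed to be given, and the greedy assignment, the `threshold` parameter, the function names, and the labels `s1` to `s4` with their toy scores are all hypothetical choices made for illustration only.

```python
from typing import Dict, FrozenSet, List, Tuple

# Symmetric pairwise scores keyed by an unordered pair of sequence labels.
Scores = Dict[FrozenSet[str], float]


def pair_score(scores: Scores, a: str, b: str) -> float:
    """Look up the similarity score for an unordered pair of labels."""
    return scores[frozenset((a, b))]


def group_by_mutual_similarity(
    names: List[str], scores: Scores, threshold: float
) -> List[List[str]]:
    """Greedily place each sequence in the first group whose members ALL
    score at or above the threshold with it (assumed reading of 'mutually
    high'); otherwise start a new group."""
    groups: List[List[str]] = []
    for name in names:
        for group in groups:
            if all(pair_score(scores, name, m) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def between_group_range(
    scores: Scores, group_a: List[str], group_b: List[str]
) -> Tuple[float, float]:
    """Return (min, max) of all scores between two groups; a narrow range
    suggests the groups relate to each other consistently."""
    values = [pair_score(scores, a, b) for a in group_a for b in group_b]
    return min(values), max(values)


# Toy, made-up scores for four placeholder sequences (not data from the paper).
toy_scores: Scores = {
    frozenset(("s1", "s2")): 0.90,
    frozenset(("s1", "s3")): 0.30,
    frozenset(("s1", "s4")): 0.20,
    frozenset(("s2", "s3")): 0.35,
    frozenset(("s2", "s4")): 0.25,
    frozenset(("s3", "s4")): 0.80,
}

groups = group_by_mutual_similarity(["s1", "s2", "s3", "s4"], toy_scores, 0.70)
score_range = between_group_range(toy_scores, groups[0], groups[1])
print(groups)
print(score_range)
```

With these toy values, `s1` and `s2` end up in one group and `s3` and `s4` in another, and the between-group scores fall in a narrow band. A greedy pass like this depends on input order; it is meant only to make the two criteria concrete, not to reproduce the paper's classification.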