Wang Y D, Zhao S, Hill C W
Department of Biochemistry and Molecular Biology, Pennsylvania State College of Medicine, Hershey, Pennsylvania 17033, USA.
J Bacteriol. 1998 Aug;180(16):4102-10. doi: 10.1128/JB.180.16.4102-4110.1998.
The Rhs elements are complex genetic composites widely spread among Escherichia coli isolates. One of their components, a 3.7-kb, GC-rich core, maintains a single open reading frame that extends the full length of the core and then 400 to 600 bp beyond into an AT-rich region. Whereas Rhs cores are homologous, core extensions from different elements are dissimilar. Two new Rhs elements from strains of the ECOR reference collection have been characterized. RhsG (from strain ECOR-11) maps to min 5.3, and RhsH (from strain ECOR-45) maps to min 32.8, where it lies in tandem with RhsE. Comparison of strain K-12 to ECOR-11 indicates that RhsG was once present in but has been largely deleted from an ancestor of K-12. Phylogenetic analysis shows that the cores from eight known elements fall into three subfamilies, RhsA-B-C-F, RhsD-E, and RhsG-H. Cores from different subfamilies diverge 22 to 29%. Analysis of substitutions that distinguish between subfamilies shows that the origin of the ancestral core as well as the process of subfamily separation occurred in a GC-rich background. Furthermore, each subfamily independently passed from the GC-rich background to a less GC-rich background such as E. coli. A new case of core-extension shuffling provides the first example of exchange between cores of different subfamilies. A novel component of RhsE and RhsG, vgr, encodes a large protein distinguished by 18 to 19 repetitions of a Val-Gly dipeptide occurring with an eight-residue periodicity.