Puechberty Jacques, Blaineau Christine, Meghamla Sabrina, Crobu Lucien, Pagès Michel, Bastien Patrick
CNRS/Université Montpellier, Biologie Moléculaire, Biologie Cellulaire et Biodiversité des Protozoaires Parasites, Laboratoire de Parasitologie-Mycologie, UFR Médecine, Montpellier, France.
BMC Genomics. 2007 Feb 24;8:57. doi: 10.1186/1471-2164-8-57.
Trypanosomatids exhibit a unique gene organization in which genes are grouped into large directional gene clusters (DGCs) oriented in opposite directions. The transcription "strand switch region" (SSR) separating the two large DGCs that constitute chromosome 1 of Leishmania major has been the subject of several studies and speculations: it has been suspected of being the single replication origin of the chromosome, the transcription initiation site for both DGCs, or even a centromere. Here, we applied an inter-species comparative genomics approach to this locus to identify conserved features or motifs indicative of a putative function.
We isolated this SSR from 15 widely divergent species of Leishmania and Sauroleishmania and compared its structure and nucleotide sequence among them. In terms of intrachromosomal position, size and AT content, the general structure of this SSR appears extremely stable among species, which is a further demonstration of the remarkable structural stability of these genomes at the evolutionary level. Sequence alignments showed several interesting features. Overall, only 30% of nucleotide positions were conserved in the SSR among the 15 species, versus 74% and 62% in the 5' parts of the adjacent XPP and PAXP genes, respectively. However, nucleotide divergences were not distributed homogeneously along this sequence. In particular, a central fragment of approximately 440 bp exhibited 54% identity among the 15 species. This fragment actually represents a new Leishmania-specific CDS of unknown function, which had been overlooked since the annotation of this chromosome. The encoded protein comprises two trans-membrane domains and is classified in the "structural protein" GO category. We cloned this novel gene and expressed it as a recombinant protein fused to green fluorescent protein, which showed that it localises to the endoplasmic reticulum. Taken together, these data shorten the actual SSR to an 887-bp segment, compared with the original 1.6 kb. In the rest of the SSR, the percentage of identity was much lower, around 22%. Interestingly, the 72-bp fragment where the putatively single transcription initiation site of chromosome 1 was identified is located in a low-conservation portion of the SSR and is itself highly polymorphic among species. Nevertheless, it is highly C-rich and presents a unique poly(C) tract in the same position in all species.
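The conservation figures quoted above (30% for the SSR, 54% for the central fragment) are percentages of alignment positions shared by all species. A minimal sketch of how such a figure could be computed from a multiple sequence alignment is given below. The function name `percent_conserved`, the toy sequences, and the choice to treat any column containing a gap as non-conserved are all illustrative assumptions; the abstract does not state which alignment tool or gap policy the authors used.

```python
def percent_conserved(aligned_seqs):
    """Return the percentage of alignment columns in which every
    sequence carries the same nucleotide.

    Assumption (not from the source): a column containing a gap ('-')
    is counted as non-conserved.
    """
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if length == 0:
        raise ValueError("aligned sequences must not be empty")
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same aligned length")

    conserved = 0
    # zip(*...) walks the alignment column by column
    for column in zip(*aligned_seqs):
        residues = {base.upper() for base in column}
        if len(residues) == 1 and "-" not in residues:
            conserved += 1
    return 100.0 * conserved / length


# Hypothetical toy alignment of three sequences (not real Leishmania data)
toy_alignment = [
    "ACGTCCCCTA",
    "ACGACCCCTA",
    "ACG-CCCCGA",
]
print(percent_conserved(toy_alignment))
```

Applying the same function to sub-ranges of the alignment (for example, the central fragment versus the flanking parts) would give per-region values comparable in kind to those reported here.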
This inter-specific comparative study, the first of its kind, (a) revealed a novel genus-specific gene and (b) identified a conserved poly(C) tract in the otherwise highly polymorphic region containing the putative transcription initiation site. This allows us to hypothesise an involvement of poly(C)-binding proteins, which are known elsewhere to be involved in transcriptional control.
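As an illustration of how a poly(C) tract could be located in a sequence, the following sketch reports the start position and length of the longest run of consecutive C residues. The function name `longest_poly_c` and the minimum run length of 4 are hypothetical choices made for this example; the abstract gives no length threshold for the tract.

```python
import re


def longest_poly_c(seq, min_len=4):
    """Return (start, length) of the longest run of C's in seq,
    or None if no run reaches min_len.

    min_len is an illustrative threshold, not a value from the source.
    Positions are 0-based; ties are resolved in favour of the first run.
    """
    pattern = re.compile("C{%d,}" % min_len)
    runs = list(pattern.finditer(seq.upper()))
    if not runs:
        return None
    best = max(runs, key=lambda m: m.end() - m.start())
    return best.start(), best.end() - best.start()


# Hypothetical sequence (not real Leishmania data)
print(longest_poly_c("ATCCCCCCGTCCCCA"))
```

Running this over each species' aligned sequence and comparing the reported start positions would be one simple way to check whether the tract sits in the same position across species.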