Jones D, Russnak R H, Kay R J, Candido E P
J Biol Chem. 1986 Sep 15;261(26):12006-15.
A locus containing two hsp16 genes in Caenorhabditis elegans has been characterized by DNA sequencing. Each gene encodes a 16-kDa polypeptide which is expressed following heat induction. The two genes, designated hsp16-2 and hsp16-41, are arranged in divergent orientations, and each contains a single intron of 46 and 58 base pairs, respectively. Although both gene transcripts are spliced efficiently in vivo, hsp16-41 corresponds to a previously isolated cDNA which contains an unspliced intron sequence. The 5'-noncoding regions of both genes contain TATA boxes preceded 18 or 19 nucleotides upstream by a heat shock regulatory sequence. The 3'-noncoding regions contain polyadenylation signals (AATAAA) either downstream (hsp16-2) or immediately adjacent (hsp16-41) to a sequence capable of forming a hairpin. This pair of hsp16 genes is flanked by three copies of an approximately 200-bp dispersed repetitive element (two copies on one side and a single one on the other side of the locus) which occurs in at least 70 copies throughout the C. elegans genome, and has been designated CeRep-16. Together with data described previously (Russnak, R. H., and Candido, E. P. M. (1985) Mol. Cell. Biol. 5, 1268-1278), the results presented here define a family of four distinct, related small heat shock protein genes. These are arranged in divergently transcribed pairs at two loci. The hsp16-48/41 genes code for one class of HSP16, 143 amino acid residues long, while the hsp16-1/2 genes encode the other class, which is 2 amino acid residues longer. Thus each locus codes for the two major types of HSP16. The two loci differ in a number of respects, including the presence of a tandem inverted duplication of two heat shock protein genes at one locus, and of repetitive elements at the other. Sequence comparisons allow us to propose a scheme for the evolution of the four genes and reveal conserved features of noncoding regions which may be involved in the regulation of their transcription, RNA processing, or translation. Using locus-specific hybridization probes, we have found that the genes at locus hsp16-2/41 are expressed at levels approximately 20-40-fold higher than those at locus hsp16-1/48.