Sainova O V, Mekhedov S L, Zhelnin L G, Khokhlova T A, Anan'ev E V
Genetika. 1993 Jul;29(7):1070-9.
A 2820 bp nucleotide sequence comprising the barley C-hordein gene lambda CH4 and its flanking regions was determined. The gene contains no introns and encodes a polypeptide of 310 amino acids. About 94% of the deduced amino acid sequence of the mature protein (291 amino acids) consists of an octapeptide motif, PQQPEPQQ, repeated along the chain between a unique 12-amino-acid NH2-terminal region and a unique 6-amino-acid COOH-terminal region. The 5' non-coding region contains TATA, AGGA, CAAT, and "endosperm" boxes. The 3' non-coding region has two polyadenylation signals. Compared with published C-hordein sequences, the lambda CH4 gene contains a number of insertions, deletions, and substitutions. The 3'-untranslated region carries two insertions, 157 bp and 23 bp long. No significant homology to the longer insertion was found in the GenBank database. This insertion is the largest known rearrangement in the otherwise highly conserved regions surrounding barley storage protein genes.
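The quoted figure of 94% can be checked against the lengths given in the abstract. The sketch below is only an arithmetic illustration, not the authors' analysis: it assumes the repeat region is everything in the 291-residue mature protein that lies between the unique 12-residue and 6-residue termini, and the variable and function names are hypothetical.

```python
# Lengths quoted in the abstract (amino acids).
MATURE_LEN = 291          # mature protein
N_TERMINAL_UNIQUE = 12    # unique NH2-terminal region
C_TERMINAL_UNIQUE = 6     # unique COOH-terminal region
MOTIF = "PQQPEPQQ"        # consensus octapeptide


def repeat_region_length(mature_len: int, n_unique: int, c_unique: int) -> int:
    """Residues left for the repeat region (assumption: all non-terminal residues)."""
    return mature_len - n_unique - c_unique


repeat_len = repeat_region_length(MATURE_LEN, N_TERMINAL_UNIQUE, C_TERMINAL_UNIQUE)
repeat_fraction = repeat_len / MATURE_LEN

# Upper bound on full-length motif copies, assuming every copy were exactly
# 8 residues long; real copies may be degenerate or of variable length.
max_full_copies = repeat_len // len(MOTIF)

print(f"repeat region: {repeat_len} residues")
print(f"share of mature protein: {repeat_fraction:.1%}")
print(f"at most {max_full_copies} full octapeptide copies")
```

Under these assumptions the repeat region spans 273 residues, which rounds to the 94% stated in the abstract.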