Sandell L J, Prentice H L, Kravis D, Upholt W B
J Biol Chem. 1984 Jun 25;259(12):7826-34.
The DNA sequence of two overlapping cDNA clones and a genomic lambda clone covering the region coding for 288 amino acids at the COOH terminus of the chicken type II procollagen gene is reported. This region consists of 4 exons coding for the last 15 amino acids of the triple helical domain and 273 amino acids which correspond to the COOH-terminal telopeptide and COOH-terminal propeptide. The sequence, base composition, and codon usage of this region of the type II procollagen gene show particularly high similarity to those of the chicken alpha 1(I) procollagen gene and differ from those of the alpha 2(I) and alpha 1(III) gene sequences. Two DNA tracts of low sequence similarity were observed. One of these regions spans the telopeptide and COOH-terminal propeptidase cleavage site, although 4-5 amino acids at the actual cleavage site are conserved compared with the alpha 1(I) and alpha 2(I) genes. A region of unusually high nucleotide sequence conservation is present in exon 2 (amino acids 171c-186c) consisting of approximately 45 nucleotides with only one or two base substitutions compared with the other procollagen genes. Within this conserved sequence is a site for carbohydrate attachment. The 3' nontranslated sequence of the type II procollagen mRNA is longer than that of either the alpha 1(I) or alpha 2(I) mRNA and contains several unusual long tracts consisting primarily of one or two bases. Although the canonical site for polyadenylation is not present, two related sequences, AACAAA and ATATAAA, are present 32 and 41 bases preceding the end of the major RNA species. The exon/intron structure of the type II procollagen gene is similar to that of other collagen genes which have been described. This DNA sequence provides the first extensive report of the amino acid sequence of chicken type II procollagen.