Finer M H, Aho S, Gerstenfeld L C, Boedtker H, Doty P
Department of Biochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts 02138.
J Biol Chem. 1987 Sep 25;262(27):13323-32.
Genomic clones corresponding to the amino-terminal propeptide and 5'-flanking sequences of the chicken pro-alpha 1(I) collagen gene were isolated as a first step in identifying DNA sequences important for transcriptional regulation of this gene. Because no positive clones could be identified in either primary or amplified genomic libraries, a 5.1-kilobase pair StuI genomic fragment identified by Southern blotting was enriched by sucrose gradient fractionation of genomic DNA and cloned into lambda gt11. Comparison of the DNA sequence of the 5.1-kilobase pair StuI fragment with that of a cDNA clone encoding the amino-terminal propeptide, signal peptide, and 5'-untranslated region identified the first four exons and most of the fifth. Exon size and intron position have been largely conserved between the human and chicken alpha 1(I) genes. DNA sequence analysis of the region 5' to the transcription initiation site identified the canonical TATA and CAAT boxes. However, the 40-nucleotide pyrimidine stretch centered between -150 and -180 nucleotides, found in all previously isolated type I procollagen genes from chicken, mouse, and human, was absent in the chicken pro-alpha 1(I) collagen gene. This sequence corresponds to the in vivo DNase I hypersensitive site in the chicken pro-alpha 2(I) and mouse pro-alpha 1(I) collagen genes, as well as the in vitro S1 nuclease hypersensitive site in both chicken and mouse pro-alpha 2(I) collagen genes. Two unusual DNA sequences were identified within the chicken pro-alpha 1(I) collagen gene. The first, fifteen tandem repeats of the sequence GGGGAGA, lies within the first intron, 300 nucleotides 3' to the first exon, and was identified by its hypersensitivity to S1 nuclease in vitro in supercoiled plasmids. The second, located 5' to -180, contains at least 25 copies of a polymorphic, 23-base pair tandemly repeated sequence not identified in other type I procollagen genes. Both tandem repeat sequences were also detected at other locations in the chicken genome by Southern blot hybridization.