Soininen R, Huotari M, Ganguly A, Prockop D J, Tryggvason K
Department of Biochemistry, University of Oulu, Finland.
J Biol Chem. 1989 Aug 15;264(23):13565-71.
The complete exon size and distribution pattern in the gene for the alpha 1 chain of human type IV collagen was determined. Clones covering 145 kilobases (kb) of genomic DNA, including 100 kb of the gene itself as well as 25 kb upstream and 20 kb downstream of the gene, were isolated from lambda phage and cosmid libraries. The overall gene structure was determined by restriction endonuclease mapping and R-loop analyses, and all exon sizes by nucleotide sequencing. The characterized clones contained all the coding sequences except for exon 2, whose sequence was determined after amplification by the polymerase chain reaction. Because four gaps remain in the intron sequences, the exact size of the gene is not known. The entire gene is at least 100 kb in size and contains 52 exons whose size distribution is completely different from that of the genes for fibrillar collagens. In the -Gly-X-Y- coding region there are three exons each of 99, 90, and 45 base pairs (bp) and two exons each of 27, 36, 42, 51, 54, 63, and 84 bp. The remaining exons in the collagenous region range in size from 71 to 192 bp. About one-half of the -Gly-X-Y- repeat coding exons start with the second base of a glycine codon, whereas the other half start (with two exceptions) with a complete glycine codon. The distribution of split versus unsplit codons is uneven in that the first 19 exons of the gene start with a complete codon. The gene contains repetitive sequences in several regions. A 185-nucleotide segment containing 40 copies of CCT flanked by poly(C) and poly(T) sequences was shown to be located adjacent to an exon. The gene has previously been shown to be located head-to-head with the alpha 2(IV) collagen gene at the distal end of the long arm of chromosome 13, such that the first exons of the two genes are separated by as little as 42 bp (Pöschl, E., Pollner, R., and Kühn, K. (1988) EMBO J. 7, 2687-2695; Soininen, R., Huotari, M., Hostikka, S. L., Prockop, D. J., and Tryggvason, K. (1988) J. Biol. Chem. 263, 17217-17220). The results demonstrate that the human alpha 1(IV) collagen gene has a structure distinctly different from the genes for fibrillar collagens and also that it is considerably larger than any collagen gene characterized to date.
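The exon-size figures quoted above can be tallied with a short script. This is only an illustrative sketch: the sizes and counts are taken from the abstract, while the divisibility checks (3 bp per codon, 9 bp per Gly-X-Y repeat) and the names `RECURRING_EXONS` and `summarize` are assumptions introduced here, not an analysis reported by the authors.

```python
# Recurring exon sizes in the -Gly-X-Y- coding region, taken from the abstract:
# size in bp -> number of exons of that size.
RECURRING_EXONS = {99: 3, 90: 3, 45: 3, 27: 2, 36: 2, 42: 2, 51: 2, 54: 2, 63: 2, 84: 2}

CODON_BP = 3   # length of one codon
REPEAT_BP = 9  # one Gly-X-Y repeat = three codons (assumed unit for this check)


def summarize(exons):
    """Tally exon count, total length, and divisibility of the listed sizes."""
    sizes = sorted(exons)
    return {
        "exon_count": sum(exons.values()),
        "total_bp": sum(size * n for size, n in exons.items()),
        "multiple_of_3": [s for s in sizes if s % CODON_BP == 0],
        "multiple_of_9": [s for s in sizes if s % REPEAT_BP == 0],
        "not_multiple_of_9": [s for s in sizes if s % REPEAT_BP != 0],
    }


summary = summarize(RECURRING_EXONS)
for key, value in summary.items():
    print(f"{key}: {value}")

# Genomic span covered by the clones: gene + upstream + downstream, in kb.
cloned_span_kb = 100 + 25 + 20
print(f"cloned_span_kb: {cloned_span_kb}")
```

The tally covers only the exons whose sizes the abstract lists explicitly, not all 52 exons of the gene.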