Owens G C, Edelman G M, Cunningham B A
Proc Natl Acad Sci U S A. 1987 Jan;84(1):294-8. doi: 10.1073/pnas.84.1.294.
The neural cell adhesion molecule, N-CAM, is expressed as at least three polypeptide chains (ld, sd, and ssd chains) specified by a single gene and derived by alternative splicing and polyadenylation-site selection during RNA processing. We describe here the characterization of seven overlapping genomic phage clones reactive with N-CAM cDNA, indicating that the chicken N-CAM gene is more than 50 kilobases long. Analysis of the gene shows that there are at least 19 exons and that the coding sequences for the ld, sd, and ssd chains are assembled from 18, 17, and 15 exons, respectively. The first 14 exons appear to be common to all three chains and encode the amino-terminal portion of N-CAM, which contains five tandem homologous repeats resembling those seen in the immunoglobulin gene superfamily. In contrast to other genes containing such domains, each of these segments in N-CAM is specified by two exons. The carboxyl-terminal portion of each N-CAM chain is different as a result of the alternative use of exons. A single exon encodes the carboxyl-terminal 26 amino acids of the ssd chain and the 3' untranslated region of its mRNA, ending with a poly(A)-addition site. Two exons encode the transmembrane and cytoplasmic sequences common to the ld and sd chains, and another exon encodes the additional 261 amino acids found in the cytoplasmic domain of the ld chain. The carboxyl-terminal 21 amino acids common to the ld and sd chains and the 3' untranslated region common to their mRNAs are encoded by a single large exon of 3475 base pairs that ends with a second poly(A)-addition site. Sequences from the 13-kilobase intron that separates the exons encoding the amino-terminal and carboxyl-terminal regions of the molecule hybridize to a 2-kilobase poly(A)+ RNA transcript of unknown identity. This description of the chicken N-CAM gene provides a basis for determining the mechanisms that regulate the differential expression of the N-CAM polypeptide chains during development.
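The exon counts quoted in the abstract can be cross-checked with a small tally. The following is only an illustrative sketch: the group names (`common_amino_terminal`, `ld_cytoplasmic_insert`, and so on) are hypothetical labels introduced here, not identifiers from the paper, and the assignment of groups to chains is inferred from the abstract's description.

```python
# Exon groups as described in the abstract.
# The group names are illustrative labels, not terms from the paper.
EXON_GROUPS = {
    "common_amino_terminal": 14,     # shared by the ld, sd, and ssd chains
    "ssd_carboxyl_terminal": 1,      # last 26 residues of ssd plus its 3' UTR
    "transmembrane_cytoplasmic": 2,  # shared by the ld and sd chains
    "ld_cytoplasmic_insert": 1,      # additional 261 residues found only in ld
    "ld_sd_carboxyl_terminal": 1,    # last 21 residues plus 3' UTR (3475 bp)
}

# Which exon groups each chain uses (assumed from the abstract's wording).
CHAIN_GROUPS = {
    "ld": [
        "common_amino_terminal",
        "transmembrane_cytoplasmic",
        "ld_cytoplasmic_insert",
        "ld_sd_carboxyl_terminal",
    ],
    "sd": [
        "common_amino_terminal",
        "transmembrane_cytoplasmic",
        "ld_sd_carboxyl_terminal",
    ],
    "ssd": [
        "common_amino_terminal",
        "ssd_carboxyl_terminal",
    ],
}


def exon_count(chain: str) -> int:
    """Return the number of exons contributing to the given chain."""
    return sum(EXON_GROUPS[group] for group in CHAIN_GROUPS[chain])


def total_exons() -> int:
    """Return the number of distinct exons across all groups."""
    return sum(EXON_GROUPS.values())


if __name__ == "__main__":
    for chain in ("ld", "sd", "ssd"):
        print(chain, exon_count(chain))
    print("total", total_exons())
```

Under these assumptions the tally gives 18, 17, and 15 exons for the ld, sd, and ssd chains and 19 exons overall, matching the figures stated in the abstract (which reports 19 as a minimum).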