Desseyn J L, Aubert J P, Van Seuningen I, Porchet N, Laine A
Unité 377 INSERM, Place de Verdun, 59045 Lille Cedex, France.
J Biol Chem. 1997 Jul 4;272(27):16873-83. doi: 10.1074/jbc.272.27.16873.
MUC5B, which maps to chromosome 11p15.5 in a cluster with MUC6, MUC2, and MUC5AC, is a human mucin gene whose genomic organization is being elucidated. We recently published the sequence and peptide organization of its very large central exon, which is 10,713 base pairs (bp) long. Here we present the genomic organization of its 3' region, which spans 10,690 bp and whose genomic sequence has been completely determined. The 3' region of MUC5B comprises 18 exons ranging in size from 32 to 781 bp, in contrast to the very large central exon. The 18 introns range in size from 114 to 1118 bp. Repetitive sequences were identified in four introns. The 18 exons encode a deduced peptide of 808 amino acids. This carboxyl-terminal region shows extensive sequence similarity to MUC2, MUC5AC, and von Willebrand factor, particularly in the number and positions of the cysteine residues, suggesting that this domain may derive from a common ancestral gene. The presence in these molecules of a cystine knot, also found in growth factors such as transforming growth factor-beta, is of particular interest. Moreover, one part of this peptide is identical to the 196-amino acid sequence deduced from the cDNA clone pSM2-1, which codes for part of the high molecular weight mucin MG1 isolated from human sublingual gland. Considering the expression pattern of MUC5B and the origin of MG1, we conclude that MUC5B encodes MG1.