Bhargava A K, Woitach J T, Davidson E A, Bhavanandan V P
Department of Biological Chemistry, Pennsylvania State University, Hershey, PA 17033.
Proc Natl Acad Sci U S A. 1990 Sep;87(17):6798-802. doi: 10.1073/pnas.87.17.6798.
A lambda gt11 cDNA library prepared from bovine submaxillary gland mRNA was screened with polyclonal anti-apo-bovine submaxillary mucin antibodies with the aim of obtaining the deduced amino acid sequence of the mucin core protein. One of the positive clones had a 1.8-kilobase (kb) cDNA insert and coded for an incomplete protein. A 2.0-kb cDNA clone was isolated by rescreening the library with the 1.8-kb cDNA. Nucleotide sequencing of the full-length 2.0-kb cDNA revealed an open reading frame that coded for a 563-amino acid protein. A striking feature of the cloned protein is the skewed distribution of the amino acids, most notably that of the hydroxy amino acids and cysteine. The amino-terminal domain of 339 residues is very rich in threonine, serine, and glycine and poor in cysteine, aspartic acid, tyrosine, phenylalanine, and tryptophan. In contrast, the carboxyl-terminal domain of 224 residues is rich in cysteine, aspartic acid, tyrosine, lysine, and asparagine and relatively poor in threonine, serine, and glycine. A search of the protein data bank for homologies to the deduced amino acid sequence revealed statistically significant matches to several proteins, including the porcine submaxillary apomucin fragment. The cysteine-rich domain by itself showed no statistically significant homology to any of the registered polypeptide sequences. RNA blot analysis using DNA probes corresponding to the mucin-like and cysteine-rich regions detected a nearly identical pattern of transcripts, demonstrating that the characterized clones are not artifacts of cDNA library construction. The blots also showed the presence of polydisperse transcripts in bovine submaxillary gland but no detectable hybridization signals in liver or brain RNA.