Gage L P, Manning R F
J Biol Chem. 1980 Oct 10;255(19):9444-50.
The DNA sequence organization of the protein-encoding region of the gene for silk fibroin has been analyzed. The accompanying paper (Manning, R. F., and Gage, L. P. (1980) J. Biol. Chem. 255, 9451-9457) shows that the total length of the gene and its protein, as well as the pattern of restriction sites in the gene, are highly polymorphic among inbred stocks of Bombyx mori. In this paper, those features of fibroin gene structure which are invariant among these alleles are presented. Fibroin is composed primarily of relatively short "crystalline" and "amorphous" peptides of known sequence whose arrangement in the protein is unknown. Knowledge of the codons most commonly used in fibroin mRNA allowed utilization of particular restriction enzymes as a means for determining the nature and organization of crystalline and amorphous coding sequences in the fibroin gene. Three restriction endonucleases were identified that cleave sequences coding for amorphous region peptides. Their cleavage pattern revealed that the repetitive coding sequence of the gene core (approximately 15 kilobases) is divided into at least 10 large crystalline coding domains interrupted by smaller amorphous coding domains. Many restriction endonucleases do not cleave the fibroin core at all, including three with four-base recognition sequences. Specific deductions as to codon usage and repetitive sequence homogeneity in the gene follow from these results. One novel finding is the rigorous exclusion of the glycine codon GGA prior to serine codons, even though this glycine codon is used frequently prior to alanine codons. The sequence homogeneity and the regularly alternating arrangement of crystalline and amorphous coding sequences of the gene are discussed in terms of the function of fibroin protein and the evolution of highly repetitive DNA.
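The reasoning in the abstract (codon choice determines which restriction sites can occur in a repetitive coding sequence, and glycine codon usage can depend on the following residue) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `glycine_codons_by_next_residue` and `site_positions` are hypothetical, the example sequence is invented rather than taken from the fibroin gene, and AGCT stands in for a generic four-base recognition sequence, since the abstract does not name the enzymes used.

```python
from collections import Counter

# Subset of the standard genetic code covering Gly, Ala and Ser only.
CODON_TO_AA = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
    "AGT": "Ser", "AGC": "Ser",
}


def glycine_codons_by_next_residue(coding_dna):
    """Count (glycine codon, following amino acid) pairs in an in-frame sequence."""
    seq = coding_dna.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter()
    for codon, following in zip(codons, codons[1:]):
        if CODON_TO_AA.get(codon) == "Gly":
            counts[(codon, CODON_TO_AA.get(following, "other"))] += 1
    return counts


def site_positions(dna, site):
    """Return 0-based start positions of a recognition sequence, overlaps included."""
    seq = dna.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Invented repeat encoding Gly-Ala-Gly-Ala-Gly-Ser twice (NOT real fibroin sequence).
example = "GGAGCTGGTGCAGGTTCA" * 2

gly_usage = glycine_codons_by_next_residue(example)
agct_sites = site_positions(example, "AGCT")
```

In this invented sequence, GGA appears only before alanine codons, and the AGCT site arises solely from the GGA-GCT codon junction; choosing a different glycine or alanine codon at that position would remove the site. This is the kind of deduction about codon usage that the presence or absence of restriction sites permits.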