Gerngross U T, Romaniec M P, Kobayashi T, Huskisson N S, Demain A L
Department of Biology, Massachusetts Institute of Technology, Cambridge 02139.
Mol Microbiol. 1993 Apr;8(2):325-34. doi: 10.1111/j.1365-2958.1993.tb01576.x.
It is known that two proteins of the cellulosomal complex of Clostridium thermocellum (SL and SS) together degrade crystalline cellulose. SL is a glycoprotein of 210,000 Da which enhances the binding to cellulose and the activity of SS, an endoglucanase of 83,000 Da. We have previously reported the cloning of a DNA fragment encoding the N-terminal end of the SL protein using antibodies raised against the native protein. A chromosomal walking approach using an EcoRI and a BamHI-Sau3A gene library allowed us to isolate the C-terminal end of the gene. Sequencing of both fragments revealed the existence of a leader peptide, as has been found in cellulases of the same organism. This leader sequence is followed by a stretch of 14 amino acids that is identical to the N-terminal amino acid sequence of the native secreted protein. The open reading frame (ORF) of this gene encodes a protein of 196,800 Da and is followed by a hairpin loop that could be involved in transcription termination. Within the ORF, we found nine internal repeated elements (IREs) of about 500 nucleotides each. Seven of these sequences displayed 98-100% homology and were located adjacent to each other within the structural gene without intervening regions. The remaining two, located at the N-terminal end of the gene, showed significantly lower homology. Bearing in mind the inherent instability of reiterated regions, we confirmed the authenticity of our clones by Southern blot analysis using chromosomal C. thermocellum DNA and ruled out the possibility of rearrangements during the cloning and sequencing process. The sequenced gene is designated cipA and the encoded SL protein CipA.
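As a rough illustration of how much of the ORF the repeats could account for, the figures quoted in the abstract can be combined in a short calculation. This is a back-of-the-envelope sketch, not a result from the paper: the average residue mass of 110 Da is a rule-of-thumb assumption, the IRE length is only given as "about 500 nucleotides", and the function name `repeat_contribution` is hypothetical.

```python
# Back-of-the-envelope estimate; only values marked "from the abstract" are sourced.
IRE_COUNT = 9                # from the abstract: nine internal repeated elements
IRE_LENGTH_NT = 500          # from the abstract: "about 500 nucleotides each" (approximate)
ORF_PRODUCT_DA = 196_800     # from the abstract: mass of the ORF-encoded protein
AVG_RESIDUE_MASS_DA = 110    # assumption: rule-of-thumb average mass per amino acid residue


def repeat_contribution(count, length_nt, residue_mass_da):
    """Return (total nucleotides, encoded residues, approximate mass in Da)."""
    total_nt = count * length_nt
    residues = total_nt // 3  # three nucleotides per codon
    return total_nt, residues, residues * residue_mass_da


total_nt, residues, mass_da = repeat_contribution(
    IRE_COUNT, IRE_LENGTH_NT, AVG_RESIDUE_MASS_DA
)
fraction = mass_da / ORF_PRODUCT_DA
print(f"{total_nt} nt -> {residues} residues -> ~{mass_da} Da "
      f"({fraction:.0%} of the ORF product)")
```

Under these assumptions the nine repeats would span roughly 4,500 nucleotides, which suggests that the repeated elements make up most of the coding sequence. The exact share depends on the true repeat lengths and amino acid composition, neither of which is given in the abstract.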