Schwarz W H, Schimming S, Rücknagel K P, Burgschwaiger S, Kreil G, Staudenbauer W L
Institute for Microbiology, Technical University of Munich, F.R.G.
Gene. 1988;63(1):23-30. doi: 10.1016/0378-1119(88)90542-2.
The nucleotide sequence of the cellulase gene celC, encoding endoglucanase C of Clostridium thermocellum, has been determined. The coding region of 1032 bp was identified by comparison with the N-terminal amino acid (aa) sequence of endoglucanase C purified from Escherichia coli. The ATG start codon is preceded by an AGGAGG sequence typical of ribosome-binding sites in Gram-positive bacteria. The derived aa sequence corresponds to a protein of Mr 40,439. The aa analysis and the apparent Mr of endoglucanase C are consistent with the aa sequence derived from the DNA sequencing data. A proposed N-terminal leader (signal) sequence of 21 aa residues differs from other prokaryotic signal peptides and is non-functional in E. coli. Most of the protein bears no resemblance to endoglucanases A, B, and D of the same organism. However, a short region of homology between endoglucanases A and C was identified, which is similar to the established active sites of lysozymes and to related sequences of fungal cellulases.
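
The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' method: it assumes that the 1032-bp coding region includes the stop codon (the abstract does not say so), and the function names are hypothetical, introduced only for illustration.

```python
# Values quoted in the abstract.
CODING_REGION_BP = 1032   # length of the celC coding region
DERIVED_MR = 40_439       # Mr of the derived protein
LEADER_RESIDUES = 21      # proposed N-terminal leader (signal) sequence


def codon_count(coding_bp: int) -> int:
    """Number of codons in a coding region; the length must be a multiple of 3."""
    if coding_bp % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return coding_bp // 3


def residue_count(coding_bp: int, includes_stop: bool = True) -> int:
    """Number of encoded residues.

    ASSUMPTION: by default the coding region is taken to include the stop
    codon, which encodes no residue. The abstract does not state this.
    """
    codons = codon_count(coding_bp)
    return codons - 1 if includes_stop else codons


residues = residue_count(CODING_REGION_BP)
mean_residue_mass = DERIVED_MR / residues

print(f"codons: {codon_count(CODING_REGION_BP)}")
print(f"encoded residues (stop codon assumed included): {residues}")
print(f"mean mass per residue: {mean_residue_mass:.1f}")
# Length remaining if the proposed 21-residue leader were removed.
print(f"residues without the proposed leader: {residues - LEADER_RESIDUES}")
```

Under that assumption, 1032 bp corresponds to 344 codons and 343 encoded residues, which gives a mean mass of roughly 118 per residue for the stated Mr of 40,439.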