Musdal Yaman, Govindarajan Sridhar, Mannervik Bengt
Department of Neurochemistry, Arrhenius Laboratories, Stockholm University, Svante Arrhenius väg 16B, SE-10691 Stockholm, Sweden.
ATUM, 37950 Central Ct, Newark, CA 94560, USA.
Protein Eng Des Sel. 2017 Aug 1;30(8):543-549. doi: 10.1093/protein/gzx045.
Exploring the vicinity around a locus of a protein in sequence space may identify homologs with enhanced properties, which could become valuable in biotechnical and other applications. A rational approach to this pursuit is the use of 'infologs', i.e. synthetic sequences with specific substitutions capturing maximal sequence information derived from the evolutionary history of the protein family. Ninety-five such infolog genes of poplar glutathione transferase were synthesized and expressed in Escherichia coli, and the catalytic activities of the proteins were determined with alternative substrates. Sequence-activity relationships derived from the infologs were used to design a second set of 47 infologs in which 90% of the members exceeded wild-type properties. Two mutants, C2 (V55I/E95D/D108E/A160V) and G5 (F13L/C70A/G122E), were further functionally characterized. The activities of the infologs with the alternative substrates 1-chloro-2,4-dinitrobenzene and phenethyl isothiocyanate, subject to different chemistries, were positively correlated, indicating that the examined mutations were affecting the overall catalytic competence without a major shift in substrate discrimination. By contrast, the enhanced protein expressivity observed in many of the mutants was not similarly correlated with the activities. In conclusion, small libraries of well-defined infologs can be used to systematically explore sequence space to optimize proteins in multidimensional functional space.
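The two characterized mutants are named with the conventional substitution shorthand (wild-type residue, position, replacement residue, joined by slashes). As a minimal illustrative sketch, not part of the published work, such labels can be split into their individual substitutions as follows; the names `Substitution` and `parse_variant` are hypothetical and chosen only for this example.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    """One amino-acid substitution, e.g. V55I (hypothetical helper type)."""
    wild_type: str   # one-letter code of the original residue
    position: int    # residue number in the protein sequence
    mutant: str      # one-letter code of the replacement residue


# One uppercase letter, a residue number, one uppercase letter.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(label: str) -> list[Substitution]:
    """Split a label such as 'F13L/C70A/G122E' into its substitutions."""
    substitutions = []
    for token in label.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        wild_type, position, mutant = match.groups()
        substitutions.append(Substitution(wild_type, int(position), mutant))
    return substitutions


# The two mutants named in the abstract.
c2 = parse_variant("V55I/E95D/D108E/A160V")
g5 = parse_variant("F13L/C70A/G122E")
```

Here `c2` holds four substitutions and `g5` holds three, matching the labels given in the abstract. The sketch only decodes the notation and makes no claim about how the infologs themselves were designed.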