Institut de Biochimie et de Biophysique Moléculaire et Cellulaire (IBBMC), Univ Paris Sud, UMR 8619, F-91405 Orsay, France.
J Mol Biol. 2010 Nov 26;404(2):307-27. doi: 10.1016/j.jmb.2010.09.048. Epub 2010 Sep 29.
Repeat proteins have a modular organization and a regular architecture that make them attractive models for design and directed evolution experiments. HEAT repeat proteins, although very common, have not been used as scaffolds for artificial proteins, probably because they are made of long and irregular repeats. Here, we present and validate a consensus sequence for artificial HEAT repeat proteins. The sequence was defined from the structure-based sequence analysis of a thermostable HEAT-like repeat protein. Appropriate sequences were identified for the N- and C-caps. A library of genes coding for artificial proteins based on this sequence design, named αRep, was assembled using a new and versatile methodology based on circular amplification. Proteins picked randomly from this library are expressed as soluble proteins. The biophysical properties of proteins with different numbers of repeats and different combinations of side chains in hypervariable positions were characterized. Circular dichroism and differential scanning calorimetry experiments showed that all these proteins fold cooperatively and are very stable (Tm > 70 °C). The stability of these proteins increases with the number of repeats. Detailed gel filtration and small-angle X-ray scattering studies showed that the purified proteins form either monomers or dimers. The X-ray structure of a stable dimeric variant was solved. The protein is folded with a highly regular topology, and the repeat structure is organized, as expected, as pairs of alpha helices. In this protein variant, the dimerization interface results directly from the variable surface, which is enriched in aromatic residues located in the randomized positions of the repeats. The dimer was crystallized in both an apo form and a PEG-bound form, revealing a very well-defined binding crevice and some structural flexibility at the interface. This fortuitous binding site could later prove useful for binding other low molecular mass partners.