Kuhlman B, Baker D
Department of Biochemistry and Howard Hughes Medical Institute, University of Washington School of Medicine, Seattle, WA 98195, USA.
Proc Natl Acad Sci U S A. 2000 Sep 12;97(19):10383-8. doi: 10.1073/pnas.97.19.10383.
How large is the volume of sequence space that is compatible with a given protein structure? Starting from random sequences, low free energy sequences were generated for 108 protein backbone structures by using a Monte Carlo optimization procedure and a free energy function based primarily on Lennard-Jones packing interactions and the Lazaridis-Karplus implicit solvation model. Remarkably, in the designed sequences 51% of the core residues and 27% of all residues were identical to the amino acids in the corresponding positions in the native sequences. The lowest free energy sequences obtained for ensembles of native-like backbone structures were also similar to the native sequence. Furthermore, both the individual residue frequencies and the covariances between pairs of positions observed in the very large SH3 domain family were recapitulated in core sequences designed for SH3 domain structures. Taken together, these results suggest that the volume of sequence space optimal for a protein structure is surprisingly restricted to a region around the native sequence.
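The design procedure described above (random starting sequence, Monte Carlo search for low free energy, comparison with the native sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_energy` is a hypothetical stand-in that only rewards hydrophobic residues at buried sites and polar residues at exposed sites, whereas the study used Lennard-Jones packing and the Lazaridis-Karplus solvation model. The burial string, the hydrophobic grouping, the temperature, and the step count are all illustrative assumptions.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: a simplified hydrophobic grouping chosen for illustration only.
HYDROPHOBIC = set("ACFILMVW")


def toy_energy(seq, burial):
    """Hypothetical energy: lower is better.

    burial is a string of 'C' (core) and 'S' (surface), one per position.
    This is NOT the Lennard-Jones / Lazaridis-Karplus function of the paper.
    """
    energy = 0.0
    for aa, site in zip(seq, burial):
        hydrophobic = aa in HYDROPHOBIC
        if site == "C":
            energy += -1.0 if hydrophobic else 1.0
        else:
            energy += 0.5 if hydrophobic else -0.5
    return energy


def design_sequence(burial, steps=2000, temperature=0.5, rng=None, start=None):
    """Metropolis Monte Carlo over single-residue substitutions.

    Returns the lowest-energy sequence seen and its energy.
    """
    if rng is None:
        rng = random.Random()
    if start is not None:
        seq = list(start)
    else:
        seq = [rng.choice(AMINO_ACIDS) for _ in burial]
    energy = toy_energy(seq, burial)
    best_seq, best_energy = seq[:], energy
    for _ in range(steps):
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # propose a point mutation
        new_energy = toy_energy(seq, burial)
        delta = new_energy - energy
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_seq, best_energy = seq[:], energy
        else:
            seq[pos] = old  # reject: restore the previous residue
    return "".join(best_seq), best_energy


def sequence_identity(a, b):
    """Fraction of aligned positions with identical residues."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

Calling `design_sequence("CCSSCSCCSS", rng=random.Random(0))` returns a designed sequence and its toy energy, and `sequence_identity(designed, native)` gives the kind of per-position identity the abstract reports. With this toy energy any hydrophobic residue is equally good at a core site, so the identities it produces say nothing about the 51% and 27% figures, which depend on the detailed energy function used in the study.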