Burke Sean, Elber Ron
Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, Texas 78712.
Proteins. 2012 Feb;80(2):463-70. doi: 10.1002/prot.23212. Epub 2011 Nov 17.
Exhaustive enumeration of sequences and folds is conducted for a simple lattice model of conformations, sequences, and energies. Examination of all foldable sequences and their nearest connected neighbors (sequences that differ by no more than a point mutation) illustrates the following: (i) An unusually large number of sequences fold into a few structures (super-folds). The same observation has been made experimentally, and computationally using stochastic sampling and exhaustive enumeration of related models. (ii) There exist only a few large networks of connected sequences that are not restricted to one fold. These networks cover a significant fraction of fold space (super-networks). (iii) There exist barriers in sequence space that prevent foldable sequences of the same structure from connecting through a series of single point mutations (super-barriers), even when sequence connections between folds are present. While there is ample experimental evidence for the existence of super-folds, evidence for super-networks is just starting to emerge. The predicted sequence barrier is an intriguing characteristic of sequence space, suggesting that the overall sequence space may be disconnected. The implications and limitations of these observations for the evolution of protein structures are discussed.
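The network analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-letter alphabet, the function names, and the toy `toy_fold_of` table are all assumptions made for illustration, and the abstract does not specify the lattice model's alphabet or energy function. The sketch takes a hypothetical mapping from each foldable sequence to its fold, links sequences that differ by one point mutation, and reports folds whose sequences fall into more than one connected network (the situation the abstract calls a super-barrier).

```python
from collections import deque

# Assumed two-letter alphabet; the model in the paper may use a different one.
ALPHABET = "HP"


def neighbors(seq, alphabet=ALPHABET):
    """Yield every sequence that differs from seq by exactly one point mutation."""
    for i, old in enumerate(seq):
        for new in alphabet:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]


def connected_components(fold_of):
    """Group foldable sequences into networks linked by single point mutations.

    fold_of is a hypothetical dict mapping each foldable sequence to a fold label;
    sequences absent from the dict are treated as non-foldable and cannot be crossed.
    """
    seen = set()
    components = []
    for start in fold_of:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:  # breadth-first search over foldable neighbors
            seq = queue.popleft()
            component.append(seq)
            for nb in neighbors(seq):
                if nb in fold_of and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(component)
    return components


def folds_split_by_barriers(fold_of):
    """Return the folds whose sequences land in more than one network."""
    networks_per_fold = {}
    for idx, component in enumerate(connected_components(fold_of)):
        for seq in component:
            networks_per_fold.setdefault(fold_of[seq], set()).add(idx)
    return {fold for fold, nets in networks_per_fold.items() if len(nets) > 1}


# Invented toy data, not results from the paper.
toy_fold_of = {"HHPP": "A", "HHPH": "A", "HPPH": "B", "PPHH": "A"}
split_folds = folds_split_by_barriers(toy_fold_of)
```

In this toy table, one network contains sequences of both fold A and fold B, while the fold-A sequence `PPHH` has no foldable single-mutation neighbor and forms its own network. Fold A is therefore split across two networks even though a sequence connection between folds exists.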