Institute of Cellular and Molecular Biology, Center for Computational Biology and Bioinformatics, and Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA.
PeerJ. 2013 Nov 12;1:e211. doi: 10.7717/peerj.211. eCollection 2013.
Computational protein design attempts to create protein sequences that fold stably into pre-specified structures. Here we compare alignments of designed proteins to alignments of natural proteins and assess how closely designed sequences recapitulate patterns of sequence variation found in natural protein sequences. We design proteins using RosettaDesign, and we evaluate both fixed-backbone designs and variable-backbone designs with different amounts of backbone flexibility. We find that proteins designed with a fixed backbone tend to underestimate the amount of site variability observed in natural proteins, while proteins designed with an intermediate amount of backbone flexibility show more realistic site variability. Further, the correlation between solvent exposure and site variability is lower in designed proteins than in natural proteins. This finding suggests that site variability is too uniform across different solvent exposure states (i.e., buried residues are too variable or exposed residues too conserved). When comparing the amino acid frequencies in the designed proteins with those in natural proteins, we find that hydrophobic residues are underrepresented in the core of the designed proteins. From these results we conclude that intermediate backbone flexibility during design results in more accurate protein design and that either scoring functions or backbone sampling methods require further improvement to accurately replicate structural constraints on site variability.