Verma Abhinav, Wenzel Wolfgang
Institute for Scientific Computing, Forschungszentrum Karlsruhe, Karlsruhe, Germany.
BMC Struct Biol. 2007 Mar 19;7:12. doi: 10.1186/1472-6807-7-12.
The reliable prediction of protein tertiary structure from the amino acid sequence remains challenging even for small proteins. We have developed an all-atom free-energy protein forcefield (PFF01) that we could use to fold several small proteins from completely extended conformations. Because the computational cost of de-novo folding studies rises steeply with system size, this approach is unsuitable for structure prediction purposes. We therefore investigate here a low-cost free-energy relaxation protocol for protein structure prediction that combines heuristic methods for model generation with all-atom free-energy relaxation in PFF01.
We use PFF01 to rank and cluster the conformations generated by ROSETTA for 32 proteins. For the 22 high-quality and 10 low-quality decoy sets, we select near-native conformations with an average Cα root mean square deviation of 3.03 Å and 6.04 Å, respectively. The protocol incorporates an inherent reliability indicator that succeeds for 78% of the decoy sets. In over 90% of these cases, near-native conformations are selected from the decoy set. This success rate is rationalized by the quality of the decoys and the selectivity of the PFF01 forcefield, which ranks near-native conformations on average 3.06 standard deviations below the mean energy of the relaxed decoys (Z-score).
All-atom free-energy relaxation with PFF01 emerges as a powerful low-cost approach toward generic de-novo protein structure prediction. The approach can be applied to large all-atom decoy sets of any origin and requires no preexisting structural information to identify the native conformation. The study provides evidence that a large class of proteins may be foldable by PFF01.
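The Z-score quoted above measures how far the energy of a near-native conformation lies below the mean energy of the relaxed decoys, in units of the standard deviation of the decoy energies. The following is a minimal sketch of that calculation, not the authors' implementation: the function name `z_score`, the use of the population standard deviation, and the energy values are all illustrative assumptions.

```python
import statistics


def z_score(native_energy, decoy_energies):
    """Return (E_native - mean(E_decoys)) / std(E_decoys).

    Hypothetical helper: a negative value means the near-native
    conformation lies below the decoy ensemble in energy.
    """
    mean = statistics.fmean(decoy_energies)
    # Assumption: population standard deviation of the decoy set.
    std = statistics.pstdev(decoy_energies)
    if std == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean) / std


# Made-up energies in arbitrary units, for illustration only.
decoys = [-4.0, -2.0, 0.0, 2.0, 4.0]
z = z_score(-10.0, decoys)
print(round(z, 2))  # -3.54
```

A more negative Z-score indicates a more selective energy function, because the near-native conformation is better separated from the bulk of the decoys.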