de Bakker Paul I W, DePristo Mark A, Burke David F, Blundell Tom L
Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom.
Proteins. 2003 Apr 1;51(1):21-40. doi: 10.1002/prot.10235.
The accuracy of model selection from decoy ensembles of protein loop conformations was explored by comparing the performance of the Samudrala-Moult all-atom statistical potential (RAPDF) and the AMBER molecular mechanics force field, including the Generalized Born/surface area (GBSA) solvation model. Large ensembles of consistent loop conformations, represented at atomic detail with idealized geometry, were generated for a large test set of protein loops 2 to 12 residues long by a novel ab initio method called RAPPER, which relies on fine-grained residue-specific phi/psi propensity tables for conformational sampling. Ranking the conformers by RAPDF score selected conformers with an average global, non-superimposed RMSD over all heavy mainchain atoms of 1.2 Å for 4-mers, 2.9 Å for 8-mers, and 6.2 Å for 12-mers. After filtering on the basis of anchor geometry and RAPDF scores, ranking by energy minimization of the AMBER/GBSA potential energy function selected conformers with global RMSD values of 0.5 Å for 4-mers, 2.3 Å for 8-mers, and 5.0 Å for 12-mers. Minimized fragments had, on average, consistently lower RMSD values (by 0.1 Å) than their initial conformations. The importance of the Generalized Born solvation energy term is reflected in the observation that the average RMSD accuracy for all loop lengths was worse when this term was omitted. There were, however, still many cases in which the AMBER gas-phase minimization selected conformers of lower RMSD than the AMBER/GBSA minimization. The AMBER/GBSA energy function correlated better with RMSD to native than RAPDF did. When the ensembles were supplemented with conformations extracted from experimental structures, a dramatic improvement in selection accuracy was observed at longer lengths (average RMSD of 1.3 Å for 8-mers) when scoring with the AMBER/GBSA force field. This work provides the basis for a promising hybrid approach of ab initio and knowledge-based methods for loop modeling.
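The accuracy measure used above, a global RMSD computed without superimposing the loop onto the native structure, and the selection of a single conformer by lowest score can be illustrated with a minimal Python sketch. This is not the RAPPER, RAPDF, or AMBER implementation: the function names `global_rmsd` and `select_lowest_score` are hypothetical, coordinates are assumed to be matched lists of heavy mainchain atom positions already in the native frame of reference, and it is assumed that a lower score or energy is more favorable.

```python
import math


def global_rmsd(coords_a, coords_b):
    """RMSD in the given frame of reference, with no superposition step.

    Both arguments are equal-length lists of (x, y, z) tuples whose
    entries correspond atom by atom (an assumption of this sketch).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def select_lowest_score(scores):
    """Return the index of the conformer with the lowest score.

    Assumes lower values are more favorable, as for an energy.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    return min(range(len(scores)), key=lambda i: scores[i])


# Toy data (invented for illustration): a two-atom "native" and two decoys.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
decoys = [
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],  # shifted 2 units along z
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to native
]
scores = [-3.0, -1.0]  # hypothetical scores; decoy 0 ranks best here

best = select_lowest_score(scores)
selected_rmsd = global_rmsd(decoys[best], native)
```

In this toy case the best-scoring decoy is not the one closest to native, which is the kind of selection error the average RMSD figures in the abstract quantify.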