Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA.
J Comput Chem. 2012 Dec 5;33(31):2483-91. doi: 10.1002/jcc.23069. Epub 2012 Jul 27.
All-atom sampling is a critical and compute-intensive final stage of protein structural modeling. Because conformational space is vast and extremely rugged, even close to the native structure, the high-resolution sampling problem is almost as difficult as predicting the rough fold of a protein. Here, we present a combination of new algorithms that considerably speeds up the exploration of very rugged conformational landscapes and is capable of finding previously hidden low-energy states. The approach is based on a hierarchical workflow and can be parallelized on supercomputers with up to 128,000 compute cores with near-perfect efficiency. Such scaling behavior is notable: with Moore's law continuing only in the number of cores per chip, parallelizability is a critical property of new algorithms. Using the enhanced sampling power, we have uncovered previously invisible deficiencies in the Rosetta force field and created an extensive decoy training set for optimizing and testing force fields.
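As an illustration of what "near-perfect efficiency" means for a parallel run, the sketch below computes strong-scaling efficiency: the measured speedup relative to a reference run, divided by the ideal speedup given by the ratio of core counts. This is the standard textbook definition, not a method taken from the paper. The function name `parallel_efficiency` and the timings in the usage lines are hypothetical and chosen only for illustration; only the 128,000-core figure appears in the abstract.

```python
def parallel_efficiency(t_ref: float, n_ref: int, t_n: float, n: int) -> float:
    """Strong-scaling efficiency of a run on n cores relative to a reference run.

    t_ref: wall-clock time of the reference run on n_ref cores
    t_n:   wall-clock time of the same problem on n cores
    A return value of 1.0 corresponds to perfect (linear) scaling.
    """
    if min(t_ref, t_n) <= 0 or min(n_ref, n) <= 0:
        raise ValueError("times and core counts must be positive")
    speedup = t_ref / t_n          # how much faster the larger run finished
    ideal_speedup = n / n_ref      # how much faster it would be with linear scaling
    return speedup / ideal_speedup


# Hypothetical timings (NOT from the paper): 1000 s on 1,000 cores
# versus 8 s on 128,000 cores for the same fixed-size problem.
efficiency = parallel_efficiency(t_ref=1000.0, n_ref=1_000, t_n=8.0, n=128_000)
print(f"efficiency = {efficiency:.3f}")
```

With these made-up timings the speedup is 125 against an ideal of 128, so the efficiency falls slightly below 1. Measuring against a multi-core reference rather than a single core is a common choice when the problem is too large to run serially.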