Rajgaria R, McAllister S R, Floudas C A
Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544-5263, USA.
Proteins. 2006 Nov 15;65(3):726-41. doi: 10.1002/prot.21149.
This work presents a novel Cα-Cα distance-dependent force field that successfully selects native structures from an ensemble of high resolution near-native conformers. An enhanced and diverse protein set, along with an improved decoy generation technique, contributes to the effectiveness of this potential. High quality decoys were generated for 1489 nonhomologous proteins and used to train an optimization-based linear programming formulation. The high resolution decoys were developed to train a simple, distance-dependent force field that yields the native structure as the lowest energy structure and assigns higher energies to decoy structures, both those that closely resemble the native structure and those that are less similar. The model also includes a set of physical constraints based on the experimentally observed physical behavior of the amino acids. The force field was tested on two sets of test decoys not in the training set and was found to excel on all the metrics that are widely used to measure the effectiveness of a force field. The high resolution force field correctly identified 113 native structures out of 150 test cases, and the average rank obtained for this test was 1.87. All the high resolution structures (training and testing) used for this work are available online and can be downloaded from http://titan.princeton.edu/HRDecoys.
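The ranking test described in the abstract can be illustrated with a minimal sketch. It assumes the energy is a sum of pairwise parameters indexed by the two amino acid types and a Cα-Cα distance bin; the bin edges, the parameter table `params`, and the function names below are hypothetical and are not taken from the paper. The rank of the native structure is 1 when no decoy has a lower energy, and averaging that rank over test cases gives a metric of the kind reported above.

```python
from itertools import combinations
from math import dist

# Hypothetical distance bin edges in angstroms (an assumption, not the paper's bins).
BIN_EDGES = [3.0, 5.0, 7.0, 9.0]


def bin_index(d):
    """Return the index of the distance bin containing d, or None if d is out of range."""
    for i in range(len(BIN_EDGES) - 1):
        if BIN_EDGES[i] <= d < BIN_EDGES[i + 1]:
            return i
    return None


def energy(sequence, ca_coords, params):
    """Sum the pairwise terms params[(aa_1, aa_2, bin)] over all C-alpha pairs.

    Pairs whose distance falls outside every bin, or whose key is missing
    from params, contribute zero.
    """
    total = 0.0
    for i, j in combinations(range(len(sequence)), 2):
        b = bin_index(dist(ca_coords[i], ca_coords[j]))
        if b is None:
            continue
        # Sort the residue types so (A, L) and (L, A) share one parameter.
        key = (*sorted((sequence[i], sequence[j])), b)
        total += params.get(key, 0.0)
    return total


def native_rank(sequence, native, decoys, params):
    """Rank of the native structure by energy (1 means lowest energy)."""
    e_native = energy(sequence, native, params)
    return 1 + sum(energy(sequence, d, params) < e_native for d in decoys)


def average_rank(ranks):
    """Mean native rank over a collection of test cases."""
    return sum(ranks) / len(ranks)


# Toy two-residue example with made-up parameters and coordinates.
toy_params = {("A", "L", 0): -1.0, ("A", "L", 1): 0.5}
toy_native = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
toy_decoys = [
    [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
]
toy_rank = native_rank("AL", toy_native, toy_decoys, toy_params)
```

In a linear programming setting, the entries of `params` would be the decision variables, and each decoy would contribute a constraint asking that its energy exceed the native energy by some margin. That training step is not shown here.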