Liu Tianyun, Horst Jeremy A, Samudrala Ram
Department of Genetics, Stanford University, Stanford, California, USA.
Proteins. 2009 Oct;77(1):220-34. doi: 10.1002/prot.22434.
The principal bottleneck in protein structure prediction is the refinement of models from lower accuracies to the resolution observed by experiment. We developed a novel constraints-based refinement method that identifies a large number of accurate input constraints from initial models and rebuilds the models using restrained torsion angle dynamics (rTAD). We previously created a Bayesian statistics-based residue-specific all-atom probability discriminatory function (RAPDF) to discriminate native-like models by measuring the probability that the atom-type distances within a given model are accurate. Here, we exploit RAPDF to score (i.e., filter) constraints from initial predictions that may or may not be close to a native-like state, obtain a consensus of the top-scoring constraints among five initial models, and compile sets with no redundant residue-pair constraints. We find that this method consistently produces a large and highly accurate set of distance constraints from which to build refinement models. We further optimize the balance between accuracy and coverage of constraints by producing multiple structure sets using different constraint distance cutoffs, and note that the cutoff governs spatially near versus distant effects in model generation. This complete procedure of deriving distance constraints for rTAD simulations significantly improves the quality of the initial predictions in every case we evaluated. By combining consensus constraints, RAPDF, and rTAD for protein structure modeling and refinement, our procedure represents a significant step toward solving the protein structure prediction and refinement problem.
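The constraint-selection steps described in the abstract (score candidate distances, keep the top-scoring ones per model, take the consensus across the initial models, and keep one constraint per residue pair) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: each model is reduced to one 3D point per residue, whereas the abstract describes atom-type distances; `toy_score` is a hypothetical stand-in and is not RAPDF; and the names `pair_distances`, `consensus_constraints`, `top_fraction`, and `min_models`, as well as the cutoff values, are invented for this example.

```python
import math
from itertools import combinations


def pair_distances(model, cutoff):
    """Return {(i, j): distance} for residue pairs no farther apart than cutoff.

    model maps a residue index to one (x, y, z) point. This is a
    simplification: the abstract refers to atom-type distances.
    """
    distances = {}
    for i, j in combinations(sorted(model), 2):
        d = math.dist(model[i], model[j])
        if d <= cutoff:
            distances[(i, j)] = d
    return distances


def toy_score(pair, distance):
    """Hypothetical placeholder score (lower is better). NOT RAPDF."""
    return distance


def consensus_constraints(models, cutoff, score, top_fraction=0.5, min_models=None):
    """Keep residue pairs that are top-scoring in at least min_models models.

    Results are keyed by residue pair, so each pair yields at most one
    constraint; its target distance is the mean over the supporting models.
    """
    if min_models is None:
        min_models = len(models)  # assumption: require full agreement by default
    support = {}
    for model in models:
        distances = pair_distances(model, cutoff)
        ranked = sorted(distances, key=lambda p: score(p, distances[p]))
        if not ranked:
            continue
        n_keep = max(1, int(len(ranked) * top_fraction))
        for pair in ranked[:n_keep]:
            support.setdefault(pair, []).append(distances[pair])
    return {
        pair: sum(ds) / len(ds)
        for pair, ds in support.items()
        if len(ds) >= min_models
    }


if __name__ == "__main__":
    # Five toy "initial models" with three residues each (made-up coordinates).
    base = {1: (0.0, 0.0, 0.0), 2: (3.0, 0.0, 0.0), 3: (10.0, 0.0, 0.0)}
    models = [dict(base) for _ in range(5)]
    # One constraint set per distance cutoff; the cutoff values are arbitrary.
    for cutoff in (5.0, 8.0, 12.0):
        print(cutoff, consensus_constraints(models, cutoff, toy_score, top_fraction=1.0))
```

Running the selection at several cutoffs mirrors the idea of producing multiple constraint sets: a small cutoff keeps only spatially near pairs, while a larger one also admits more distant pairs. In this sketch the resulting sets would then be handed to a separate rebuilding step, which is not modeled here.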