Paluszewski Martin, Karplus Kevin
Department of Computer Science, University of Copenhagen, Copenhagen, Denmark.
Proteins. 2009 May 15;75(3):540-9. doi: 10.1002/prot.22262.
Given a set of alternative models for a specific protein sequence, the model quality assessment (MQA) problem asks for a score to be assigned to each model in the set. A good MQA program assigns scores that correlate well with the true quality of the models, ideally giving the best score to the model closest to the true structure. In this article, we present a new approach to the MQA problem. It is based on distance constraints extracted from alignments to templates of known structure, and it is implemented in the Undertaker program for protein structure prediction. One novel feature is that we extract noncontact constraints as well as contact constraints. We describe how the distance constraints are extracted and show how they can be used to address the MQA problem. We evaluated our method on CASP7 targets, and the results show that it is at least comparable with the best MQA methods assessed at CASP7. We also propose Kendall's tau as an evaluation measure that is more interpretable than the conventional measures used for evaluating MQA methods (Pearson's r and Spearman's rho). We show clear examples where Kendall's tau agrees much better with our intuition of a correct MQA, and we therefore propose that Kendall's tau be used for future CASP MQA assessments.
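The two ideas in the abstract, scoring models against contact and noncontact distance constraints and judging the resulting ranking with Kendall's tau, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from Undertaker: the constraint tuple format `(i, j, kind, cutoff)`, the 8 Å cutoff, scoring a model by the fraction of constraints it satisfies, the toy coordinates, and the quality values. The tau computed is the simple tau-a variant, which ignores tie corrections.

```python
import math
from itertools import combinations

# Hypothetical constraint format (an assumption, not Undertaker's):
# (i, j, kind, cutoff), where "contact" means residues i and j should be
# within cutoff, and "noncontact" means they should be farther apart.
CONSTRAINTS = [
    (0, 2, "contact", 8.0),
    (0, 3, "noncontact", 8.0),
    (1, 3, "contact", 8.0),
]

def constraint_score(coords, constraints):
    """Return the fraction of distance constraints a model satisfies."""
    satisfied = 0
    for i, j, kind, cutoff in constraints:
        d = math.dist(coords[i], coords[j])
        if kind == "contact" and d <= cutoff:
            satisfied += 1
        elif kind == "noncontact" and d > cutoff:
            satisfied += 1
    return satisfied / len(constraints)

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Toy models: one (x, y, z) point per residue, invented for illustration.
MODEL_A = [(0, 0, 0), (3.8, 0, 0), (5, 3, 0), (10, 0, 0)]
MODEL_B = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (13, 0, 0)]
MODEL_C = [(0, 0, 0), (3.8, 0, 0), (9, 0, 0), (6, 0, 0)]

# Predicted scores from the constraints, and made-up "true" quality values.
scores = [constraint_score(m, CONSTRAINTS) for m in (MODEL_A, MODEL_B, MODEL_C)]
true_quality = [80.0, 30.0, 55.0]

tau = kendall_tau(scores, true_quality)
print(scores, tau)
```

In this toy case the predicted scores order the models A, B, C while the assumed true quality orders them A, C, B, so two of the three model pairs are ranked correctly and one is not, giving tau = (2 - 1) / 3 = 1/3. This direct reading, the fraction of correctly ordered pairs minus the fraction of incorrectly ordered pairs, is what makes tau easy to interpret.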