Institute of Biotechnology, Vilnius University, Graičiūno 8, LT-02241 Vilnius, Lithuania.
Proteins. 2013 Jan;81(1):149-62. doi: 10.1002/prot.24172. Epub 2012 Sep 29.
Evaluation of protein models against the native structure is essential for the development and benchmarking of protein structure prediction methods. Although a number of evaluation scores have been proposed to date, many aspects of model assessment still lack the desired robustness. In this study, we present CAD-score, a new evaluation function that quantifies differences between physical contacts in a model and in the reference structure. The new score uses the concept of residue-residue contact area difference (CAD) introduced by Abagyan and Totrov (J Mol Biol 1997; 268:678-685). Contact areas, the underlying basis of the score, are derived using the Voronoi tessellation of protein structure. The newly introduced CAD-score is a continuous function, confined within fixed limits and free of any arbitrary thresholds or parameters. The built-in logic for the treatment of missing residues allows consistent ranking of models of any degree of completeness. We tested CAD-score on a large set of diverse models and compared it to GDT-TS, a widely accepted measure of model accuracy. Like GDT-TS, CAD-score showed robust performance on single-domain proteins, but displayed a stronger preference for physically more realistic models. Unlike GDT-TS, the new score provided a balanced assessment of domain rearrangement, removing the need to treat single-domain, multi-domain, and multi-subunit structures differently. Moreover, CAD-score makes it possible to assess the accuracy of inter-domain or inter-subunit interfaces directly. In addition, the approach offers an alternative to superposition-based model clustering. The CAD-score implementation is available both as a web server and as a standalone software package at http://www.ibt.lt/bioinformatics/cad-score/.
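The abstract does not state the scoring formula, so the following is only a minimal sketch of how a contact-area-difference score with the described properties (continuous, bounded, tolerant of missing residues) might be computed. It assumes a bounded form in which each per-contact difference is capped at the reference contact area and the sum is normalized by the total reference contact area. The function name `cad_score`, the dictionary input layout, and the residue labels are hypothetical, and the contact areas are assumed to have been computed beforehand (for example, from a Voronoi tessellation); none of this is taken from the CAD-score software itself.

```python
from typing import Dict, Tuple

# Hypothetical representation: (residue_i, residue_j) -> contact area.
Contact = Tuple[str, str]


def cad_score(target: Dict[Contact, float], model: Dict[Contact, float]) -> float:
    """Sketch of a bounded contact-area-difference score in [0, 1].

    Assumption: for every contact present in the reference (target)
    structure, the absolute area difference is capped at the reference
    area, so a single contact can never contribute more than its own area.
    """
    total_area = sum(target.values())
    if total_area <= 0:
        raise ValueError("reference structure has no contacts")

    bounded_diff = 0.0
    for pair, t_area in target.items():
        # A contact absent from the model (e.g. a missing residue)
        # is treated as having zero area.
        m_area = model.get(pair, 0.0)
        bounded_diff += min(abs(t_area - m_area), t_area)

    return 1.0 - bounded_diff / total_area


if __name__ == "__main__":
    # Illustrative, made-up contact areas.
    reference = {("A1", "A2"): 10.0, ("A1", "A3"): 5.0}
    candidate = {("A1", "A2"): 8.0}  # second contact is missing
    print(cad_score(reference, candidate))
```

Under these assumptions, a model that reproduces every reference contact exactly scores 1, a model with none of the reference contacts scores 0, and no threshold or tunable parameter is involved. Restricting the dictionary keys to residue pairs that span two domains or subunits would give an interface-only variant of the same calculation.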