Departments of Biophysics and Biochemistry, Howard Hughes Medical Institute, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas.
Genome Center, University of California, Davis, California.
Proteins. 2019 Dec;87(12):1021-1036. doi: 10.1002/prot.25775. Epub 2019 Jul 24.
Protein target structures for the Critical Assessment of Structure Prediction round 13 (CASP13) were split into evaluation units (EUs) based on their structural domains, the domain organization of available templates, and the performance of servers on whole targets compared to split target domains. Eighty targets were split into 112 EUs. Based on target difficulty, the EUs were classified into categories suitable for assessing high-accuracy modeling (template-based modeling [TBM]) or topology (free modeling [FM]). Assignment into assessment categories considered the following criteria: (a) the evolutionary relationship of target domains to existing fold space as defined by the Evolutionary Classification of Protein Domains (ECOD) database; (b) the clustering of target domains using eight objective sequence, structure, and performance measures; and (c) the placement of target domains in a scatter plot of target difficulty against server performance, as used in the previous CASP. Generally, target domains with good server predictions had close template homologs and were classified as TBM. Conversely, targets with poor server predictions represented a mixture of fast-evolving homologs, structure analogs, and new folds, and were classified as FM or FM/TBM overlap.
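The general pattern described above (good server predictions with close template homologs go to TBM, poor predictions without them go to FM, and the remainder falls into the overlap) can be illustrated with a minimal sketch. This is not the assessors' procedure: the score scale, both cutoffs, the `EvaluationUnit` fields, and the example EUs are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical cutoffs on a 0-100 server performance scale.
# These values are illustrative assumptions, not taken from the paper.
TBM_MIN_SCORE = 60.0
FM_MAX_SCORE = 40.0


@dataclass
class EvaluationUnit:
    name: str                 # hypothetical EU label
    server_score: float       # illustrative server performance, 0-100
    has_close_template: bool  # assumed flag: a close template homolog exists


def assign_category(eu: EvaluationUnit) -> str:
    """Assign an illustrative assessment category to one EU."""
    # Good server predictions plus a close template homolog: TBM.
    if eu.server_score >= TBM_MIN_SCORE and eu.has_close_template:
        return "TBM"
    # Poor server predictions and no close template homolog: FM.
    if eu.server_score <= FM_MAX_SCORE and not eu.has_close_template:
        return "FM"
    # Everything in between is treated as the overlap category.
    return "FM/TBM"


# Made-up example EUs, not CASP13 targets.
units = [
    EvaluationUnit("EU-A", 85.0, True),
    EvaluationUnit("EU-B", 25.0, False),
    EvaluationUnit("EU-C", 50.0, False),
    EvaluationUnit("EU-D", 30.0, True),
]

categories = {eu.name: assign_category(eu) for eu in units}
```

A fixed-threshold rule like this captures only criterion (c) in a crude form; the assignment described in the abstract also weighed ECOD relationships and clustering over eight measures.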