Monastyrskyy Bohdan, D'Andrea Daniel, Fidelis Krzysztof, Tramontano Anna, Kryshtafovych Andriy
Genome Center, University of California, Davis, California, 95616.
Proteins. 2014 Feb;82 Suppl 2(0 2):138-53. doi: 10.1002/prot.24340. Epub 2013 Aug 31.
We present the results of the assessment of the intramolecular residue-residue contact predictions from 26 prediction groups participating in the 10th round of the CASP experiment. The most recently developed direct coupling analysis methods did not take part in the experiment, likely because they require a very deep sequence alignment, which was not available for any of the 114 CASP10 targets. The performance of contact prediction methods was evaluated with the measures used in previous CASPs (i.e., prediction accuracy and the difference between the distribution of the predicted contacts and that of all pairs of residues in the target protein), as well as with new measures, such as the Matthews correlation coefficient, the area under the precision-recall curve, and the ranks of the first correctly and incorrectly predicted contacts. We also evaluated the ability to detect interdomain contacts and tested whether the difficulty of predicting contacts depends upon the protein length and the depth of the family sequence alignment. The analyses were carried out on the target domains for which structural homologs did not exist or were difficult to identify. The evaluation was performed for all types of contacts (short, medium, and long-range), with emphasis placed on long-range contacts, i.e., those involving residues separated by at least 24 residues along the sequence. The assessment suggests that the best CASP10 contact prediction methods perform at approximately the same level, and comparably to those participating in CASP9.
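As an illustration of two of the measures named above, the following minimal sketch computes prediction accuracy and the Matthews correlation coefficient for long-range contacts only. It is not the assessors' code. It assumes that accuracy means TP/(TP+FP), that residues are numbered from 1, and that contacts are given as plain index pairs; the function names `long_range` and `evaluate` are hypothetical.

```python
import math

MIN_SEP = 24  # long-range threshold stated in the abstract


def long_range(pairs, min_sep=MIN_SEP):
    """Keep pairs separated by at least min_sep residues, as ordered tuples."""
    return {(min(i, j), max(i, j)) for i, j in pairs if abs(i - j) >= min_sep}


def evaluate(predicted, native, length, min_sep=MIN_SEP):
    """Return (accuracy, mcc) for long-range contacts in a chain of `length` residues.

    Assumption: accuracy is TP / (TP + FP), i.e. the fraction of predicted
    contacts that are present in the native structure.
    """
    pred = long_range(predicted, min_sep)
    true = long_range(native, min_sep)

    # Number of residue pairs (i < j) with j - i >= min_sep.
    n = length - min_sep
    total = n * (n + 1) // 2 if n > 0 else 0

    tp = len(pred & true)
    fp = len(pred - true)
    fn = len(true - pred)
    tn = total - tp - fp - fn

    accuracy = tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Toy data (made up for illustration, not CASP10 results).
native = [(1, 25), (2, 28), (3, 30), (5, 10)]
predicted = [(1, 25), (28, 2), (4, 29), (5, 10)]
accuracy, mcc = evaluate(predicted, native, length=30)
print(f"accuracy={accuracy:.3f} mcc={mcc:.3f}")
```

In this toy case the pair (5, 10) is ignored on both sides because its sequence separation is below 24, so only three predicted and three native contacts enter the counts.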