Handl Julia, Knowles Joshua, Lovell Simon C
Faculty of Life Sciences, University of Manchester, Manchester, UK.
Bioinformatics. 2009 May 15;25(10):1271-9. doi: 10.1093/bioinformatics/btp150. Epub 2009 Mar 17.
Decoy datasets, consisting of a solved protein structure and numerous alternative native-like structures, are in common use for the evaluation of scoring functions in protein structure prediction. Several pitfalls with the use of these datasets have been identified in the literature, as well as useful guidelines for generating more effective decoy datasets. We contribute to this ongoing discussion an empirical assessment of several decoy datasets commonly used in experimental studies.
We find that artefacts and sampling issues in the large majority of these data make it trivial to discriminate the native structure. This underlines that evaluation based on the rank/z-score of the native is a weak test of scoring function performance. Moreover, sampling biases present in the way decoy sets are generated or used can strongly affect other types of evaluation measures, such as the correlation between score and root mean squared deviation (RMSD) to the native. We demonstrate how, depending on the type of bias and the evaluation context, sampling biases may lead to either over- or under-estimation of the quality of scoring terms, functions or methods.
Links to the software and data used in this study are available at http://dbkgroup.org/handl/decoy_sets.
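As an illustration of the evaluation measures named in the abstract, the following minimal Python sketch computes the rank and z-score of the native structure and the Pearson correlation between score and RMSD. It is not the software used in the study. The function names, the convention that lower scores are better, and the toy numbers are all assumptions made for this example. The toy data are constructed so that the score tracks RMSD only for near-native decoys, which shows how restricting the sample to distant decoys can change the measured correlation.

```python
import math
from statistics import mean, pstdev


def native_rank(native_score, decoy_scores):
    """Rank of the native among the decoys (1 = best).

    Assumes lower scores are better (an assumption of this sketch).
    """
    return 1 + sum(1 for s in decoy_scores if s < native_score)


def native_z_score(native_score, decoy_scores):
    """Z-score of the native relative to the decoy score distribution."""
    return (native_score - mean(decoy_scores)) / pstdev(decoy_scores)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy decoy set: RMSD to the native (in angstroms) and score.
# The score follows RMSD for the first four decoys and is noisy for the rest.
rmsds = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0, 10.0, 11.0]
scores = [-10.0, -9.0, -8.0, -7.0, -3.0, -3.5, -2.5, -3.2]
native_score = -12.0  # hypothetical score of the native structure

print("rank of native:", native_rank(native_score, scores))
print("z-score of native:", native_z_score(native_score, scores))
print("correlation, all decoys:", pearson(rmsds, scores))
# Keeping only the distant decoys (a biased sample) weakens the correlation.
print("correlation, distant decoys only:", pearson(rmsds[4:], scores[4:]))
```

In this toy set the native ranks first with a negative z-score, yet the score-RMSD correlation depends heavily on which decoys are included. This is the kind of sampling effect the abstract describes.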