Schüürmann Gerrit, Ebert Ralf-Uwe, Chen Jingwen, Wang Bin, Kühne Ralph
Department of Ecological Chemistry, UFZ Helmholtz Centre for Environmental Research, Permoserstrasse 15, 04318 Leipzig, Germany.
J Chem Inf Model. 2008 Nov;48(11):2140-5. doi: 10.1021/ci800253u.
The external prediction capability of quantitative structure-activity relationship (QSAR) models is often quantified using the predictive squared correlation coefficient, q². This index relates the predictive residual sum of squares, PRESS, to the activity sum of squares, SS, without postprocessing the model output; such postprocessing is done automatically when calculating the conventional squared correlation coefficient, r². According to the current OECD guidelines, q² for external validation should be calculated with SS referring to the training set activity mean. Our present findings, including a mathematical proof, demonstrate that this approach systematically overestimates the prediction capability, and that the overestimation is triggered by the difference between the training and test set activity means. Example calculations with three regression models and data sets taken from the literature further show that, for external test sets, q² based on the training set activity mean may even become larger than r². As a consequence, we suggest always using the test set activity mean when quantifying the external prediction capability through q², and revising the respective OECD guidance document accordingly. The discussion includes a comparison between the r² and q² value ranges and the q² statistics for cross-validation.
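The effect described in the abstract can be illustrated with a minimal sketch. It assumes the usual definition q² = 1 - PRESS/SS, which the abstract implies but does not write out, and it uses made-up activity values, not data from the paper. The function name `q2` and all variable names are hypothetical. The sketch also checks the standard sum-of-squares decomposition SS(training mean) = SS(test mean) + n_test · (test mean - training mean)², which is one way to see why SS, and hence q², can only grow when the training set mean is used; whether the paper's proof follows this route is not stated in the abstract.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def q2(y_obs, y_pred, reference_mean):
    """Predictive squared correlation coefficient, assumed here as 1 - PRESS / SS.

    PRESS is the sum of squared prediction residuals on the test set;
    SS is the sum of squared deviations of the observed test activities
    from `reference_mean` (training set mean or test set mean).
    """
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss = sum((o - reference_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss


# Illustrative, made-up activities (not taken from the paper).
y_train = [1.0, 2.0, 3.0, 4.0]      # training set activities, mean 2.5
y_test = [4.0, 5.0, 6.0]            # external test set activities, mean 5.0
y_test_pred = [4.5, 5.0, 5.5]       # model predictions for the test set

train_mean = mean(y_train)
test_mean = mean(y_test)

q2_train_mean = q2(y_test, y_test_pred, train_mean)  # SS about the training set mean
q2_test_mean = q2(y_test, y_test_pred, test_mean)    # SS about the test set mean

# Decomposition: SS about the training mean exceeds SS about the test mean
# by n_test * (test_mean - train_mean)^2, which is never negative.
ss_train_ref = sum((o - train_mean) ** 2 for o in y_test)
ss_test_ref = sum((o - test_mean) ** 2 for o in y_test)
ss_shift = len(y_test) * (test_mean - train_mean) ** 2

print(f"q2 (training set mean): {q2_train_mean:.4f}")
print(f"q2 (test set mean):     {q2_test_mean:.4f}")
```

In this toy case PRESS is the same in both calculations, so the higher q² obtained with the training set mean comes entirely from the shift between the two activity means and not from better predictions.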