Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India.
Mini Rev Med Chem. 2012 Jun;12(6):491-504. doi: 10.2174/138955712800493861.
Validation of quantitative structure-activity relationship (QSAR) models plays a key role in selecting robust and predictive models that can then be used to predict the activity of new molecules. Traditionally, QSAR models are validated using classical metrics for internal validation (Q²) and external validation (R² pred). It has recently been shown that, for data sets with a wide range of the response variable, these traditional metrics tend to reach high values without truly reflecting the absolute differences between the observed and predicted response values, because in both cases the predicted residuals are compared against the deviations of the observed values from the training set mean. Roy et al. have recently developed a new parameter, modified r² (rm²), which considers the actual difference between the observed and predicted response data without reference to the training set mean, and thereby serves as a more stringent measure of model predictivity than the traditional validation parameters (Q² and R² pred). The rm² parameter has three variants: (i) rm² (LOO) for internal validation, (ii) rm² (test) for external validation, and (iii) rm² (overall) for analyzing the overall performance of the developed model, considering predictions for both the internal and external validation sets. Thus, the rm² metrics strictly judge the ability of a QSAR model to predict the activity/toxicity of untested molecules. The present review surveys the development of the different rm² metrics and then their application, in reports by several groups, to the selection of the best QSAR models in modeling studies.
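The contrast drawn above can be illustrated with a small numerical sketch. The abstract does not state the formulas, so the code below assumes the commonly cited definitions: R² pred = 1 − Σ(obs − pred)² / Σ(obs − training mean)², and rm² = r² × (1 − √(r² − r0²)), where r² is the squared correlation between observed and predicted values and r0² is the corresponding coefficient for a regression of observed on predicted values forced through the origin. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math


def r2_pred(y_obs, y_pred, train_mean):
    """Classical metric (assumed form): residuals are judged against the
    deviations of the observed values from the training set mean."""
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_ref = sum((o - train_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss_ref


def rm2(y_obs, y_pred):
    """Modified r^2 (assumed form): r^2 * (1 - sqrt(r^2 - r0^2))."""
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    mean_pred = sum(y_pred) / n
    s_op = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(y_obs, y_pred))
    s_oo = sum((o - mean_obs) ** 2 for o in y_obs)
    s_pp = sum((p - mean_pred) ** 2 for p in y_pred)
    r2 = s_op ** 2 / (s_oo * s_pp)
    # Slope of the regression of observed on predicted through the origin.
    k = sum(o * p for o, p in zip(y_obs, y_pred)) / sum(p * p for p in y_pred)
    r0_2 = 1.0 - sum((o - k * p) ** 2 for o, p in zip(y_obs, y_pred)) / s_oo
    # max() guards against a tiny negative difference from rounding.
    return r2 * (1.0 - math.sqrt(max(0.0, r2 - r0_2)))


# Toy data: every prediction is off by exactly one unit, and the
# (hypothetical) training set mean lies far from these observed values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
classical = r2_pred(observed, predicted, train_mean=10.0)
modified = rm2(observed, predicted)
print(f"R2 pred = {classical:.3f}, rm2 = {modified:.3f}")
```

Under these assumptions the classical metric stays close to 1 because the reference sum of squares is large, while rm² is penalized for the constant offset between observed and predicted values. The same `rm2` function would be applied to leave-one-out predictions for rm² (LOO), to test set predictions for rm² (test), and to the pooled predictions for rm² (overall).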