Kolossov E, Stanforth R
ID Business Solutions Ltd., 2 Occam Court, Occam Road, Surrey Research Park, Guildford, Surrey, UK.
SAR QSAR Environ Res. 2007 Jan-Mar;18(1-2):89-100. doi: 10.1080/10629360601053984.
The statistical quality of a QSAR model is usually described in terms of goodness-of-fit and confidence in predictivity (predictive power). Its assessment has three parts: (1) measurement of goodness-of-fit, (2) validation of model stability, and (3) predictivity analysis. There are currently no mandatory requirements for the validation methods to be used, nor rules for quantitative confidence estimates. Comparing the statistical quality of QSAR models requires an overall statistical quality index that depends jointly on the goodness-of-fit, validation and predictivity results. This in turn requires defining a set of mandatory parameters for all three parts of the assessment and developing an approach to overall quality estimation based on these parameters. The overall index must also include a penalty mechanism for absent parameters. The goal of the present study is to analyse the parameters for all three parts of QSAR model statistical quality assessment and to investigate a flexible weighting approach to developing the overall statistical quality index. Because different statistical parameters are traditionally used to assess goodness-of-fit, a mechanism is needed that allows a flexible set of parameters to be used in the overall index. The final set of mandatory parameters can be selected only after approval by the scientific community and regulatory boards.
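The abstract does not give the form of the overall index, so the following is only a minimal sketch of one way a flexibly weighted index with a penalty for absent parameters could look. It assumes a simple weighted mean of parameters already scaled to [0, 1]; the function name `overall_quality_index`, the parameter names `r2`, `q2_loo` and `r2_test`, and the weights are all hypothetical and are not taken from the paper.

```python
from typing import Mapping, Optional


def overall_quality_index(
    params: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
    missing_penalty: float = 0.0,
) -> float:
    """Weighted mean of quality parameters, each clipped to [0, 1].

    Assumption (not from the paper): a parameter that is listed in
    `weights` but absent from `params` (or None) is scored as
    `missing_penalty`, so unreported statistics lower the index.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    score = 0.0
    for name, weight in weights.items():
        value = params.get(name)
        if value is None:
            value = missing_penalty  # penalise the absent parameter
        score += weight * min(max(value, 0.0), 1.0)
    return score / total_weight


# Hypothetical weights: one parameter per part of the assessment.
weights = {"r2": 1.0, "q2_loo": 1.0, "r2_test": 2.0}

full = overall_quality_index({"r2": 0.9, "q2_loo": 0.8, "r2_test": 0.7}, weights)
partial = overall_quality_index({"r2": 0.9, "q2_loo": 0.8}, weights)
print(f"all parameters reported: {full:.3f}")
print(f"external test omitted:   {partial:.3f}")
```

Because the weights are passed in as a mapping, the set of parameters can change without touching the function, which is the kind of flexibility the abstract calls for. A model that omits a heavily weighted parameter scores lower than one that reports it.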