Schultz T W, Netzeva T I, Cronin M T D
Department of Comparative Medicine, College of Veterinary Medicine, The University of Tennessee, Knoxville, TN 37996-4543, USA.
SAR QSAR Environ Res. 2004 Oct-Dec;15(5-6):385-97. doi: 10.1080/10629360412331297344.
Validation of a quantitative structure-activity relationship (QSAR) is now considered an integral part of its development. Assessing the quality of a QSAR and the confidence that may be placed in its predictions is vital to any validation procedure. A number of terms associated with the quality of a QSAR, confidence in that QSAR, or both may be quantified. These terms include: (1) goodness of fit of the model (r²); (2) predictivity of the model (Q²); (3) stability of the model, described as the difference between fit and predictivity (Dfp); (4) number of compounds used in the training set (Nc); (5) number of descriptors used in the model (Nd); (6) range of toxicity values (Tr); (7) number of mechanisms of toxic action covered by the training set (Nm); as well as two factors associated with confidence in the biological data: (8) reproducibility of the data (Rconf) and (9) confidence in the source of the data (Sconf). While all these factors may influence the quality of, and/or confidence in, a particular QSAR, each varies within different limits. To enable a quantitative assessment of quality and confidence in a QSAR, the terms deemed to be important were weighted and combined to create a Confidence Index (CI):

$$
\mathrm{CI} = \frac{\left((r^2)^4 \times 6\right) \times \left((Q^2)^4 \times 6\right) \times \ln(N_c/10) \times T_r \times (S_{conf})^{0.5}}{\ln(N_d^2 + 2) \times \ln(N_m^2 + 2) \times \left(\left((r^2)^4 \times 6\right) - \left((Q^2)^4 \times 6\right) + 1\right) \times R_{conf}}
$$
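As a rough illustration, the Confidence Index can be sketched in Python. This sketch assumes the expression is a ratio: the fit, predictivity, training-set size, toxicity range and source-confidence terms form the numerator, and the descriptor, mechanism, stability and reproducibility terms form the denominator. The function name `confidence_index`, its argument names and the input checks are hypothetical and are not taken from the paper. The scales on which Tr, Rconf and Sconf are scored are not given in the abstract, so they are passed through unchanged.

```python
import math


def confidence_index(r2, q2, n_c, n_d, n_m, t_r, s_conf, r_conf):
    """Sketch of the Confidence Index (CI), assuming the ratio reading.

    r2, q2     : goodness of fit and predictivity of the model
    n_c        : number of compounds in the training set
    n_d, n_m   : number of descriptors / mechanisms of toxic action
    t_r        : range of toxicity values
    s_conf     : confidence in the source of the data
    r_conf     : reproducibility of the data
    """
    if n_c <= 0:
        raise ValueError("n_c must be positive for ln(Nc/10) to be defined")
    if s_conf < 0:
        raise ValueError("s_conf must be non-negative for the square root")

    fit = (r2 ** 4) * 6    # (r2)^4 x 6
    pred = (q2 ** 4) * 6   # (Q2)^4 x 6

    numerator = fit * pred * math.log(n_c / 10) * t_r * math.sqrt(s_conf)
    denominator = (
        math.log(n_d ** 2 + 2)
        * math.log(n_m ** 2 + 2)
        * (fit - pred + 1)  # stability term: gap between fit and predictivity
        * r_conf
    )
    if denominator == 0:
        raise ValueError("denominator is zero for these inputs")
    return numerator / denominator
```

Under this reading, a larger gap between fit and predictivity, more descriptors, or more mechanisms of action each increase the denominator and so lower the index, while a training set of exactly 10 compounds gives ln(1) = 0 and therefore a CI of zero.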