Jaworska Joanna, Nikolova-Jeliazkova Nina, Aldenberg Tom
Procter and Gamble, Eurocor, Central Product Safety, 100 Temselaan, 1853 Strombeek-Bever, Belgium.
Altern Lab Anim. 2005 Oct;33(5):445-59. doi: 10.1177/026119290503300508.
As the use of Quantitative Structure-Activity Relationship (QSAR) models for chemical management increases, the reliability of the predictions from such models is a matter of growing concern. The OECD QSAR Validation Principles recommend that a model should be used within its applicability domain (AD). The Setubal Workshop report provided conceptual guidance on defining a (Q)SAR AD, but that guidance is difficult to use directly. Practical application of the AD concept requires an operational definition that permits the design of an automatic (computerised), quantitative procedure for determining a model's AD. This paper attempts to address that need by reviewing methods and criteria for estimating the AD through training set interpolation in descriptor space. It is proposed that response space should be included in the training set representation, so that training set chemicals are points in n-dimensional descriptor space and m-dimensional model response space. Four major approaches for estimating interpolation regions in a multivariate space are reviewed and compared: range, distance, geometrical, and probability density distribution.
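The difference between these approaches can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the toy training set, the Euclidean distance to the centroid, the fixed-bandwidth Gaussian kernel, and the cut-offs (largest training distance, lowest training density) are all assumptions chosen for illustration. The geometrical approach (for example a convex hull) is omitted because it needs a computational geometry or linear programming routine.

```python
import math

# Hypothetical training set: five chemicals in a 2-D descriptor space.
train = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]


def in_range(train, x):
    """Range approach: x lies inside the axis-aligned bounding box."""
    return all(min(col) <= xi <= max(col) for col, xi in zip(zip(*train), x))


def centroid(train):
    """Mean of the training points, one coordinate per descriptor."""
    return [sum(col) / len(train) for col in zip(*train)]


def in_distance(train, x):
    """Distance approach (assumed variant): x is no farther from the
    centroid than the most distant training point."""
    c = centroid(train)
    cutoff = max(math.dist(p, c) for p in train)
    return math.dist(x, c) <= cutoff


def density(train, x, bandwidth):
    """Unnormalised Gaussian kernel density estimate at x."""
    return sum(
        math.exp(-math.dist(p, x) ** 2 / (2 * bandwidth ** 2)) for p in train
    ) / len(train)


def in_density(train, x, bandwidth=1.0):
    """Probability density approach (assumed variant): the density at x
    is at least the lowest density found at any training point."""
    cutoff = min(density(train, p, bandwidth) for p in train)
    return density(train, x, bandwidth) >= cutoff


query = (0, 4)  # an empty corner of the bounding box
print(in_range(train, query), in_density(train, query))
```

In this toy case the query sits inside the descriptor ranges but far from every training point, so the range check accepts it while the density check rejects it. This is the kind of empty region inside the ranges that the different interpolation criteria treat differently.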