Multicase Inc., 23811 Chagrin Boulevard, Suite 305, Beachwood, Ohio 44122, USA.
J Chem Inf Model. 2012 Oct 22;52(10):2609-18. doi: 10.1021/ci300111r. Epub 2012 Sep 18.
Fragment-based expert system models of toxicological end points consist primarily of a set of substructures that are statistically related to the toxic property in question. These special substructures are often referred to as toxicity alerts, toxicophores, or biophores. They are the main building blocks and classifying units of the model, and it is important to define the chemical structural space within which the alerts are expected to produce reliable predictions. Furthermore, defining an appropriate applicability domain is required as part of the OECD guidelines for the validation of quantitative structure-activity relationships (QSARs). In this respect, this paper describes a method to construct applicability domains for individual toxicity alerts that are part of the CASE Ultra expert system models. Defining an applicability domain for individual alerts was necessary because each CASE Ultra model is composed of multiple alerts, and different alerts of a model usually represent different toxicity mechanisms and cover different structural space; an applicability domain for the overall model is therefore often not adequate. The domain for each alert was constructed using a set of fragments that were found to be statistically related to the end point in question, as opposed to using overall structural similarity or physicochemical properties. The use of the applicability domains to reduce false positive predictions is demonstrated. It is now possible to obtain ROC (receiver operating characteristic) profiles of CASE Ultra models by applying domain adherence cutoffs to the alerts identified in test chemicals. This helps optimize the performance of a model based on its true positive-false positive trade-off and reduces the drastic effects that the active/inactive ratio of the model's training set can have on predictive performance. None of the major currently available commercial expert systems for toxicity prediction offers the possibility of exploring the full sensitivity-specificity spectrum of a model, and therefore the methodology developed in this study can be of benefit in improving the predictive ability of alert-based expert systems.
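The idea of tracing a ROC profile by sweeping a domain adherence cutoff can be illustrated with a minimal Python sketch. It is not the CASE Ultra implementation. It assumes, purely for illustration, that each test chemical already carries a single adherence score between 0 and 1 for the alert it triggered (`None` when no alert fired), and that a chemical is predicted positive only when that score meets the cutoff. The function name `roc_points`, the score scale, and the toy data are all hypothetical.

```python
from typing import List, Optional, Tuple


def roc_points(
    adherence: List[Optional[float]],
    active: List[bool],
    cutoffs: List[float],
) -> List[Tuple[float, float, float]]:
    """Return (cutoff, false positive rate, true positive rate) per cutoff.

    adherence[i] is a hypothetical domain adherence score for the alert found
    in chemical i, or None if no alert was found (always predicted negative).
    """
    n_pos = sum(active)
    n_neg = len(active) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one active and one inactive chemical")

    points = []
    for cutoff in cutoffs:
        tp = fp = 0
        for score, is_active in zip(adherence, active):
            # An alert only counts when the chemical lies far enough
            # inside the alert's applicability domain.
            predicted_positive = score is not None and score >= cutoff
            if predicted_positive and is_active:
                tp += 1
            elif predicted_positive:
                fp += 1
        points.append((cutoff, fp / n_neg, tp / n_pos))
    return points


# Toy data (invented): four actives followed by four inactives.
scores = [0.9, 0.8, 0.4, None, 0.7, 0.3, None, 0.2]
labels = [True, True, True, True, False, False, False, False]

for cutoff, fpr, tpr in roc_points(scores, labels, [0.0, 0.5, 1.0]):
    print(f"cutoff={cutoff:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Raising the cutoff discards alerts found in chemicals that fall outside the alert's domain, which lowers the false positive rate at the cost of some true positives; choosing the cutoff is what moves the model along its sensitivity-specificity trade-off.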