Leadscope, Inc., 1393 Dublin Road, Columbus, OH, 43215.
Toxicol Mech Methods. 2008;18(2-3):277-95. doi: 10.1080/15376510701857502.
ABSTRACT Genetic toxicity data from various sources were integrated into a rigorously designed database using the ToxML schema. The public sources include U.S. Food and Drug Administration (FDA) submission data from approved new drug applications, food contact notifications, and generally recognized as safe (GRAS) food ingredients, as well as chemicals from the NTP and CCRIS databases. The data from public sources were then combined with data from private industry according to ToxML criteria. The resulting "integrated" database, enriched in pharmaceuticals, was used for data mining analysis. Structural features describing the database were used to differentiate the chemical spaces of drugs/candidates, food ingredients, and industrial chemicals. In general, structures for drugs/candidates and food ingredients are associated with lower frequencies of mutagenicity and clastogenicity, whereas industrial chemicals as a group contain a much higher proportion of positives. Structural features were selected to analyze endpoint outcomes of the genetic toxicity studies. Although most of the well-known genotoxic carcinogenic alerts were identified, some discrepancies from the classic Ashby-Tennant alerts were observed. Using these influential features as the independent variables, the results of four types of genotoxicity studies were correlated. High Pearson correlations were found between the results of Salmonella mutagenicity testing and those of mouse lymphoma assay testing, as well as those of in vitro chromosome aberration studies. This paper demonstrates the usefulness of representing a chemical by its structural features and of using these features to profile a battery of tests rather than relying on a single toxicity test of a given chemical. It also presents data mining/profiling methods applied in a weight-of-evidence approach to assess the potential for genetic toxicity and to guide the development of intelligent testing strategies.
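The feature-based correlation of assay results described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how each structural feature is scored, so the sketch assumes one plausible choice: for each feature, the fraction of compounds containing it that tested positive in a given assay. These per-feature rates are then compared between two assays with a Pearson correlation. The function names, feature labels, assay keys, and toy records are all hypothetical and are not taken from the paper or its database.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def positive_rate_by_feature(records, assay):
    """Fraction of positive calls per structural feature for one assay.

    records: iterable of (features, results) pairs, where features is a set
    of feature names and results maps an assay name to True (positive) or
    False (negative). Compounds not tested in the assay are skipped.
    """
    counts = {}
    for features, results in records:
        if assay not in results:
            continue
        for feature in features:
            positives, total = counts.get(feature, (0, 0))
            counts[feature] = (positives + int(results[assay]), total + 1)
    return {f: positives / total for f, (positives, total) in counts.items()}


def assay_correlation(records, assay_a, assay_b):
    """Pearson correlation of per-feature positive rates between two assays."""
    rates_a = positive_rate_by_feature(records, assay_a)
    rates_b = positive_rate_by_feature(records, assay_b)
    shared = sorted(set(rates_a) & set(rates_b))  # features seen in both assays
    return pearson([rates_a[f] for f in shared], [rates_b[f] for f in shared])


# Toy, made-up records for illustration only (not data from the paper).
toy_records = [
    ({"nitro_aromatic"}, {"salmonella": True, "mouse_lymphoma": True}),
    ({"nitro_aromatic"}, {"salmonella": True, "mouse_lymphoma": False}),
    ({"epoxide"}, {"salmonella": True, "mouse_lymphoma": True}),
    ({"carboxylic_acid"}, {"salmonella": False, "mouse_lymphoma": False}),
    ({"carboxylic_acid", "alcohol"}, {"salmonella": False, "mouse_lymphoma": False}),
    ({"alcohol"}, {"salmonella": False, "mouse_lymphoma": True}),
]

if __name__ == "__main__":
    r = assay_correlation(toy_records, "salmonella", "mouse_lymphoma")
    print(f"Pearson r across features: {r:.3f}")
```

Scoring at the feature level rather than the compound level means two assays can be compared even when they were run on different sets of chemicals, provided both sets cover the same structural features. A real analysis would also need a minimum compound count per feature before trusting its rate.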