GlaxoSmithKline, Gunnels Wood Road, Stevenage, Herts SG1 2NY, United Kingdom.
Department of Medicinal Chemistry, Medical University of Lublin, Poland.
Regul Toxicol Pharmacol. 2022 Mar;129:105109. doi: 10.1016/j.yrtph.2021.105109. Epub 2021 Dec 27.
Several public efforts are aimed at discovering patterns or classifiers in the high-dimensional bioactivity space that predict tissue, organ or whole-animal toxicological endpoints. The current study sought to assess and compare the predictions of the Globally Harmonized System (GHS) categories and Dangerous Goods (DG) classifications based on lethal dose (LD50) from several available tools (ACD/Labs, Leadscope, T.E.S.T., CATMoS, CaseUltra). External validation was done using a dataset of 375 substances to demonstrate the tools' predictive capacity. All models showed very good performance for identifying non-toxic compounds, which would be useful for DG classification, developing or triaging new chemicals, prioritizing existing chemicals for more detailed and rigorous toxicity assessments, and assessing non-active pharmaceutical intermediates. This would ultimately reduce animal use and improve risk assessments. Category-to-category prediction was not optimal, mainly due to the tendency to overpredict the outcome and the general limitations of acute oral toxicity (AOT) in vivo studies. Although overprediction does not specifically pose a risk to human health, it can impact transport and material packaging requirements. Performance for compounds with LD50 ≤ 300 mg/kg (approx. 5% of the dataset) was the poorest among all groups and could potentially be improved by including expert review and read-across to similar substances.
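To make the category-to-category comparison concrete, the following is a minimal sketch of how an oral LD50 value could be mapped to a GHS acute oral toxicity category, and how an overprediction could be flagged. It assumes the standard GHS oral cut-offs (5, 50, 300, 2000 and 5000 mg/kg body weight; Category 5 is optional and not adopted in every jurisdiction). The function names are hypothetical and this is not the procedure or code used in the study.

```python
from bisect import bisect_left

# Inclusive upper bounds (mg/kg body weight) of GHS acute oral
# toxicity Categories 1 to 5. Assumption: standard GHS cut-offs.
GHS_ORAL_UPPER_BOUNDS = [5, 50, 300, 2000, 5000]


def ghs_oral_category(ld50_mg_per_kg):
    """Return the GHS category (1-5), or None if LD50 > 5000 mg/kg."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be a positive dose in mg/kg")
    # bisect_left keeps a value equal to a bound in the lower
    # (more severe) category, matching the inclusive "<=" cut-offs.
    idx = bisect_left(GHS_ORAL_UPPER_BOUNDS, ld50_mg_per_kg)
    return idx + 1 if idx < len(GHS_ORAL_UPPER_BOUNDS) else None


def is_overpredicted(predicted_ld50, experimental_ld50):
    """True if the predicted category is more severe than the experimental one."""
    # Treat "not classified" (None) as a notional category 6 for comparison.
    pred = ghs_oral_category(predicted_ld50) or 6
    exp = ghs_oral_category(experimental_ld50) or 6
    return pred < exp


if __name__ == "__main__":
    print(ghs_oral_category(250))        # falls in Category 3
    print(is_overpredicted(250, 1200))   # predicted Cat 3 vs experimental Cat 4
```

Under this mapping, the LD50 ≤ 300 mg/kg group discussed above corresponds to Categories 1 to 3.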