Lane Thomas R, Harris Joshua, Urbina Fabio, Ekins Sean
Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA.
J Chem Health Saf. 2023 Mar 27;30(2):83-97. doi: 10.1021/acs.chas.2c00088. Epub 2023 Feb 23.
The lethal dose or concentration that kills 50% of the animals (LD50 or LC50) is an important parameter for understanding the toxicity of chemicals in different scenarios. It can be used to make go/no-go decisions and ultimately assists in choosing the right personal protective equipment needed for containment. LD50 assessment has historically required the use of many animals, although modern methods have reduced the number of rats needed. Because a compound is usually considered highly toxic when its LD50 is lower than 25 mg/kg, such a classification provides potentially valuable safety information to synthetic chemists and other safety assessment scientists. Alternative approaches such as computational methods are needed to reduce animal use for this testing further still. We summarize our efforts to use public data to build LD50 or LC50 classification and regression machine learning models for various species (rat, mouse, fish and daphnia), their 5-fold cross-validation statistics with different machine learning algorithms, and an external curated test set for mouse LD50. These datasets consist of different molecule classes, may cover different activity ranges, and vary in size. One challenge of using such computational models is that their applicability domain needs to be understood so that they can make reliable predictions for novel molecules. These machine learning models will also need to be backed up with experimental validation. However, such models could also be used to bridge gaps in individual toxicity datasets. Making such models available also opens them up to potential misuse or dual use. We propose that these models could be used for scoring the millions of commercially available molecules, most of which likely do not have a known LD50 or, for that matter, any toxicity data.
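As a minimal illustration of two ideas mentioned in the abstract, the sketch below labels compounds as highly toxic using the stated 25 mg/kg cutoff and builds 5-fold cross-validation splits. It is not the authors' pipeline: the function names, the random seed, and the example values are hypothetical, and the strict "lower than" comparison is an assumption based on the abstract's wording.

```python
import random

# Cutoff stated in the abstract: "highly toxic" when the value is lower than 25 mg/kg.
HIGH_TOX_THRESHOLD_MG_PER_KG = 25.0


def is_highly_toxic(ld50_mg_per_kg):
    """Return True when the dose is strictly below the 25 mg/kg cutoff (assumed strict)."""
    return ld50_mg_per_kg < HIGH_TOX_THRESHOLD_MG_PER_KG


def k_fold_indices(n_samples, k=5, seed=0):
    """Build k (train, test) index splits so each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical dose values in mg/kg, invented for illustration only.
ld50_values = [3.0, 12.5, 24.9, 25.0, 80.0, 150.0, 420.0, 900.0, 1500.0, 5000.0]
labels = [is_highly_toxic(v) for v in ld50_values]
splits = k_fold_indices(len(ld50_values), k=5)

for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} training samples, {len(test)} held-out samples")
```

In practice, each training split would be used to fit a model and the held-out split to score it, with the statistics averaged over the five folds.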