Basant Nikita, Gupta Shikha, Singh Kunwar P
ETRC, Gomtinagar, Lucknow-226 010, India.
Environmental Chemistry Division, CSIR-Indian Institute of Toxicology Research, Post Box 80, Mahatma Gandhi Marg, Lucknow-226 001, India.
Toxicol Res (Camb). 2016 Apr 26;5(4):1029-1038. doi: 10.1039/c6tx00083e. eCollection 2016 Jul 1.
The experimental determination of the multi-generation reproductive toxicity of chemicals involves high costs and a large number of animal studies over a long period of time. Computational toxicology offers ways to overcome these difficulties. In this study, we established ensemble machine learning (EML) based quantitative structure-activity relationship (QSAR) models for predicting the reproductive toxicity potential of structurally diverse chemicals, expressed as the lowest observed adverse effect level (LOAEL), in accordance with the OECD guidelines. Decision tree forest (DTF) and decision tree boost (DTB) QSAR models were developed using a novel dataset composed of the toxicity endpoints for 334 chemicals. Structural features of the chemicals relevant to their toxicity potential were identified and used in QSAR modeling. The generalization and prediction abilities of the constructed QSAR models were evaluated by internal and external validation procedures and by deriving several stringent statistical criteria. In the test set, the two models (DTF and DTB) yielded R² values of 0.856 and 0.945, respectively, between the experimental and predicted endpoint toxicity values. The models were also evaluated for predictive use through the most recent criteria based on root mean squared error (RMSE) and mean absolute error (MAE). The values of the various statistical validation coefficients derived for the test data were above their respective threshold limits, which lends high confidence to the analysis. The applicability domains of the constructed QSAR models were defined using the leverage and standardization approaches. The results suggest that the proposed QSAR models can reliably predict the reproductive toxicity potential of diverse chemicals and can be useful tools for screening new chemicals for safety assessment.
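To make the validation quantities named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes that R² is the squared Pearson correlation between experimental and predicted values, and that the leverage approach flags a compound when its leverage exceeds the warning value 3(p + 1)/n commonly used in QSAR work (p descriptors, n training compounds). The abstract confirms neither detail. All function names and the toy numbers are hypothetical.

```python
import math


def validation_metrics(observed, predicted):
    """Squared Pearson correlation (assumed meaning of R^2), RMSE and MAE."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally sized sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant values")
    errors = [o - p for o, p in zip(observed, predicted)]
    return {
        "r2": cov * cov / (var_o * var_p),
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
    }


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("descriptor matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def leverages(train, query):
    """Leverage h = x^T (X^T X)^-1 x of each query row; X holds the training rows."""
    k = len(train[0])
    xtx = [[sum(row[i] * row[j] for row in train) for j in range(k)]
           for i in range(k)]
    return [sum(xi * zi for xi, zi in zip(x, solve(xtx, list(x)))) for x in query]


def warning_leverage(n_train, n_descriptors):
    """Conventional warning leverage h* = 3(p + 1)/n (an assumed convention)."""
    return 3 * (n_descriptors + 1) / n_train


# Toy data: an intercept column plus one hypothetical descriptor.
train = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
h_new = leverages(train, [[1.0, 10.0]])[0]
outside_domain = h_new > warning_leverage(n_train=4, n_descriptors=1)
```

In this toy case the query compound lies far from the training descriptors, so its leverage exceeds the warning value and a prediction for it would be treated as an extrapolation outside the applicability domain.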