Modeling the toxicity of chemicals to Tetrahymena pyriformis using heuristic multilinear regression and heuristic back-propagation neural networks.

Author information

Kahn Iiris, Sild Sulev, Maran Uko

Affiliation

Institute of Chemistry, University of Tartu, 2 Jakobi Str., Tartu, Estonia.

Publication information

J Chem Inf Model. 2007 Nov-Dec;47(6):2271-9. doi: 10.1021/ci700231c. Epub 2007 Nov 7.

DOI:10.1021/ci700231c

PMID:17985864

Abstract

In recent years, considerable effort has been devoted to modeling the toxicity of chemicals to Tetrahymena pyriformis for medium- and large-sized data sets using various artificial neural network (ANN) techniques. The motivation has been to model highly complex, nonlinear relationships so that wide structural diversity can be described within a single model. The current work compares the performance of two heuristic methods in developing quantitative structure-activity relationship (QSAR) models: the best multilinear regression (BMLR) approach and heuristic back-propagation neural networks (hBNN). The modeling is based on a diverse data set of 1371 organic chemicals with toxicity data (log(1/IGC50)) collected from the literature. The toxicity values correspond to the static 40-h Tetrahymena pyriformis population growth impairment assay. The comparison showed that the BMLR approach produced acceptable QSAR models (R² = 0.726), whereas the hBNN method produced a statistically more significant model (R² = 0.826) for the given endpoint. The hBNN method related a different set of descriptors to toxicity than the BMLR method did. Both models were validated with an external prediction set. The descriptors in the models were analyzed and discussed.
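The kind of comparison described in the abstract, a multilinear regression against a back-propagation network scored by R², can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic stand-ins for descriptors and log(1/IGC50) values, the function names are hypothetical, and neither the BMLR descriptor-selection heuristic nor the hBNN heuristic is reproduced. It fits an ordinary least-squares model and a plain one-hidden-layer network trained by full-batch gradient descent.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_multilinear(X, y):
    """Ordinary least squares with an intercept (no descriptor selection)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_multilinear(coef, X):
    return coef[0] + X @ coef[1:]

def fit_backprop_net(X, y, hidden=8, lr=0.05, epochs=3000, seed=0):
    """One tanh hidden layer, linear output, trained by back-propagation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                 # hidden activations, (n, hidden)
        err = H @ W2 + b2 - y                    # prediction error, (n,)
        # Gradients of the half mean squared error
        grad_W2 = H.T @ err / n
        grad_b2 = err.mean()
        dZ = np.outer(err, W2) * (1.0 - H ** 2)  # error back-propagated through tanh
        grad_W1 = X.T @ dZ / n
        grad_b1 = dZ.mean(axis=0)
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2
    return W1, b1, W2, b2

def predict_backprop_net(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

# Synthetic stand-in data (NOT from the paper): three made-up "descriptors"
# and a target with one deliberately nonlinear (squared) term.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 1.0 * X[:, 0] - 0.5 * X[:, 1] + 0.8 * X[:, 2] ** 2 + rng.normal(0.0, 0.1, size=200)

r2_lin = r_squared(y, predict_multilinear(fit_multilinear(X, y), X))
r2_nn = r_squared(y, predict_backprop_net(fit_backprop_net(X, y), X))
print(f"multilinear R^2 = {r2_lin:.3f}")
print(f"back-prop net R^2 = {r2_nn:.3f}")
```

Because the synthetic target contains a squared term that a linear model cannot capture, the network has room to reach a higher R² here. Both scores are computed on the training data only; a comparison like the one in the abstract would also evaluate an external prediction set.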
