Eldred D V, Weikel C L, Jurs P C, Kaiser K L
Department of Chemistry, 152 Davey Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
Chem Res Toxicol. 1999 Jul;12(7):670-8. doi: 10.1021/tx980273w.
Interest in the prediction of toxicity without the use of experimental data is growing, and quantitative structure-activity relationship (QSAR) methods are valuable for such predictions. A QSAR study of the acute aqueous toxicity of 375 diverse organic compounds has been developed using only calculated structural features as independent variables. Toxicity is expressed as -log(LD50) in units of -log(mmol/L) and ranges from -3 to 6. Multiple linear regression and computational neural networks (CNNs) are used for model building. The best model is a nonlinear CNN model based on eight calculated molecular structure descriptors. The root-mean-square log(LD50) errors for the training, cross-validation, and prediction sets of this CNN model are 0.71, 0.77, and 0.74 -log(mmol/L), respectively. These results are compared with those of a previous study of the same data set, which included many more descriptors and used experimental data in the descriptor pool.
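As a rough illustration of the quantities reported above, the following Python sketch shows how a concentration in mmol/L maps onto the -log scale and how a root-mean-square error is computed on that scale. The function names and the sample values are hypothetical and are not taken from the study; the sketch only illustrates the arithmetic, not the authors' modeling procedure.

```python
import math


def neg_log_toxicity(conc_mmol_per_l: float) -> float:
    """Convert a concentration in mmol/L to the -log10 scale."""
    if conc_mmol_per_l <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(conc_mmol_per_l)


def rmse(observed, predicted) -> float:
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values, for illustration only (not data from the study).
observed = [neg_log_toxicity(c) for c in (1000.0, 1.0, 0.001)]
predicted = [-2.5, 0.4, 3.3]
error = rmse(observed, predicted)
```

On this scale, the reported range of -3 to 6 corresponds to concentrations from 1000 mmol/L down to 10^-6 mmol/L, so larger values indicate more toxic compounds.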