Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada, Ensenada, Baja California, México.
Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional, Colegio de Ciencias de la Salud, Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador.
Chem Res Toxicol. 2019 Jun 17;32(6):1178-1192. doi: 10.1021/acs.chemrestox.9b00011. Epub 2019 May 17.
Quantitative structure-activity relationship (QSAR) models are introduced to predict acute oral toxicity (AOT), using the QuBiLS-MAS (acronym for quadratic, bilinear and N-linear maps based on graph-theoretic electronic-density matrices and atomic weightings) framework for molecular encoding. Three training sets were employed to build the models: the EPA training set (5931 compounds), the EPA-full training set (7413 compounds), and the Zhu training set (10,152 compounds). Additionally, the EPA test set (1482 compounds) was used to validate the QSAR models built on the EPA training set, while the ProTox (425 compounds) and T3DB (284 compounds) external sets were employed to assess all the models. The k-nearest neighbor, multilayer perceptron, random forest, and support vector machine procedures were employed to build several base (individual) models. The base models with R ≥ 0.75 (R = correlation coefficient) and MAE ≤ 0.5 (MAE = mean absolute error) were retained to build consensus models. As a result, two consensus models based on the minimum operator, denoted M19 and M22, and one consensus model based on the weighted average operator, denoted M24, were selected as the best ones for their respective training sets. According to the applicability domain (AD) analysis performed, model M19 (built on the EPA training set) has MAE = 0.4044, 0.4067, and 0.2586 on the EPA test set, ProTox external set, and T3DB external set, respectively; model M22 (built on the EPA-full set) has MAE = 0.3992 and 0.2286 on the ProTox and T3DB external sets, respectively, and model M24 (built on the Zhu set) has MAE = 0.3773 and 0.2471 on the same two sets. These outcomes were compared and statistically validated against 14 QSAR methods (e.g., admetSAR, ProTox-II) from the literature. As a result, model M22 presents the best overall performance.
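The retention rule and the two consensus operators described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy predictions, and the way the weights are supplied are all assumptions made for illustration, since the abstract does not state how the weights of the weighted average operator are derived. The minimum operator is taken here as the element-wise minimum over the retained base predictions.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / sqrt(var_t * var_p)


def select_base_models(y_true, predictions, r_min=0.75, mae_max=0.5):
    """Keep base models with R >= r_min and MAE <= mae_max (thresholds from the text)."""
    return {
        name: pred
        for name, pred in predictions.items()
        if pearson_r(y_true, pred) >= r_min and mae(y_true, pred) <= mae_max
    }


def consensus_min(predictions):
    """Minimum operator: per-compound minimum over the base-model predictions."""
    return [min(values) for values in zip(*predictions.values())]


def consensus_weighted_average(predictions, weights):
    """Weighted average operator; `weights` is caller-supplied (an assumption)."""
    total = sum(weights[name] for name in predictions)
    n_compounds = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * pred[i] for name, pred in predictions.items()) / total
        for i in range(n_compounds)
    ]


# Toy data, purely hypothetical: four compounds, three base models.
y_true = [1.0, 2.0, 3.0, 4.0]
base_predictions = {
    "knn": [1.1, 2.1, 2.9, 4.2],
    "rf": [0.9, 1.8, 3.2, 3.9],
    "bad": [4.0, 1.0, 3.5, 0.5],  # fails both thresholds
}

retained = select_base_models(y_true, base_predictions)
min_consensus = consensus_min(retained)
avg_consensus = consensus_weighted_average(retained, {"knn": 1.0, "rf": 1.0})
```

In this toy case only the `knn` and `rf` models pass the thresholds, so both consensus predictions are computed from those two alone.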
In addition, a retrospective study of 261 drugs withdrawn because of toxic/side effects was performed to assess the usefulness of prospectively applying the proposed QSAR models to the labeling of chemicals. The models were also compared with the methods from the literature. As a result, model M22 shows the best ability to label a compound as toxic according to the Globally Harmonized System of Classification and Labelling of Chemicals (GHS). Therefore, it can be concluded that the proposed models, especially model M22, constitute prominent tools for studying AOT, as they provide the best results among all the methods examined. Freely available software was also developed for use in virtual screening tasks (http://tomocomd.com/apps/ptoxra).
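As an illustration of GHS-based labeling, the sketch below maps an oral LD50 value to a GHS acute oral toxicity category using the standard GHS cutoffs (5, 50, 300, 2000, and 5000 mg/kg body weight). The conversion helper assumes predictions are expressed as -log10(LD50 in mol/kg), a unit the abstract does not state. The function names are hypothetical, and the sketch does not reproduce how the authors decided which categories count as "toxic".

```python
# Upper LD50 bounds (mg/kg body weight) for GHS acute oral toxicity categories 1-5.
GHS_ORAL_CUTOFFS = [(5.0, 1), (50.0, 2), (300.0, 3), (2000.0, 4), (5000.0, 5)]


def ghs_oral_category(ld50_mg_per_kg):
    """Return the GHS acute oral toxicity category (1-5), or None if above 5000 mg/kg."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    for upper_bound, category in GHS_ORAL_CUTOFFS:
        if ld50_mg_per_kg <= upper_bound:
            return category
    return None  # not classified under the GHS acute oral scheme


def pld50_to_mg_per_kg(pld50, mol_weight_g_per_mol):
    """Convert -log10(LD50 [mol/kg]) to LD50 in mg/kg (endpoint unit is an assumption)."""
    ld50_mol_per_kg = 10.0 ** (-pld50)
    return ld50_mol_per_kg * mol_weight_g_per_mol * 1000.0


# Hypothetical compound: predicted pLD50 = 2.0, molecular weight 100 g/mol.
ld50 = pld50_to_mg_per_kg(2.0, 100.0)
category = ghs_oral_category(ld50)
```

Under these assumptions the hypothetical compound has an LD50 of about 1000 mg/kg, which falls in Category 4.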