Wang Yukun, Chen Xuebo
School of Chemical Engineering, University of Science and Technology Liaoning, No. 185, Qianshan, Anshan 114051, Liaoning, China
School of Electronic and Information Engineering, University of Science and Technology Liaoning, No. 185, Qianshan, Anshan 114051, Liaoning, China
RSC Adv. 2020 Jun 4;10(36):21292-21308. doi: 10.1039/d0ra02701d. eCollection 2020 Jun 2.
Acute toxicity to the fathead minnow (Pimephales promelas) is an important indicator for evaluating the hazards and risks of compounds in aquatic environments. The aim of our study is to explore the predictive power of a quantitative structure-activity relationship (QSAR) model based on a radial basis function (RBF) neural network with a joint optimization method, to study the acute toxicity mechanism, and to develop a potential acute toxicity prediction model for fathead minnow. To ensure the symmetry and fairness of the data splitting and to generate multiple chemically diverse training and validation sets, we used a self-organizing map (SOM) neural network to split the modeling dataset (containing 955 compounds), characterized by PaDEL descriptors. After preliminary selection of descriptors by the mean decrease impurity method, a hybrid quantum particle swarm optimization (HQPSO) algorithm was used to jointly optimize the parameters of the RBF network and select the key descriptors. We established 20 RBF-based QSAR models, and the statistical results showed that the 10-fold cross-validation results ( ) and the adjusted coefficients of determination ( ) were all greater than 0.7 and 0.8, respectively. The of these models was between 0.6480 and 0.7317, and the was between 0.6563 and 0.7318. Combining the frequency and importance of the descriptors used in the RBF-based models with the correlation between the descriptors and acute toxicity, we concluded that the water distribution coefficient, molar refractivity, and first ionization potential are important factors affecting acute toxicity to fathead minnow. A consensus QSAR model was established from the RBF-based models; this model showed good performance with = 0.9118, = 0.7632, and = 0.7430. A frequency-weighted and distance (FWD)-based application domain (AD) definition method was proposed, and the outliers were analyzed carefully. Compared with previous studies, the method proposed in this paper has clear advantages, and its robustness and external predictive power are also better than those of an XGBoost-based model. It is an effective QSAR modeling method.
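The statistics named in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions of the coefficient of determination and its adjusted form, and it assumes that a consensus prediction is the unweighted mean of the individual model predictions; the abstract does not state how the consensus is formed, so that choice and all function names here are hypothetical.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_descriptors):
    """Penalise R^2 for the number of descriptors used by the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_descriptors - 1)


def consensus_predict(per_model_predictions):
    """Average several models' predictions compound by compound.

    Assumption: an unweighted mean; the paper may use another scheme.
    """
    return [mean(column) for column in zip(*per_model_predictions)]
```

Under these assumptions, `consensus_predict` takes one list of predicted toxicities per model (all in the same compound order), and the result can be passed to `r_squared` together with the observed values to score the consensus.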