QSAR modelling study of the bioconcentration factor and toxicity of organic compounds to aquatic organisms using machine learning and ensemble methods.

Author information

Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Shenyang, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China; School of Life Science, Liaoning University, Shenyang, 110036, China.

School of Life Science, Liaoning University, Shenyang, 110036, China.

Publication information

Ecotoxicol Environ Saf. 2019 Sep 15;179:71-78. doi: 10.1016/j.ecoenv.2019.04.035. Epub 2019 Apr 23.

Abstract

Bioconcentration factors and median lethal concentrations (LC50 values) are important for assessing the risks that organic pollutants pose to aquatic ecosystems. Various quantitative structure-activity relationship (QSAR) models have been developed to predict bioconcentration factors and classify acute toxicity. In this study, we developed a regression model using the recursive feature elimination (RFE) method combined with the support vector machine (SVM) algorithm, with 2D molecular descriptors calculated for a dataset of 450 diverse chemicals. For the classification models, we calculated 12 molecular fingerprints for a dataset of 400 diverse chemicals and built three ensemble models using three machine learning algorithms. The R and R values for the regression model were 0.860 and 0.757, respectively. Other parameters indicated that the regression model made good predictions and could efficiently predict a new set of compounds following the standards set by Golbraikh, Tropsha, and Roy. Among the classification models, the ensemble-SVM model gave an overall accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) of 92.2%, 95.1%, 86.0%, and 0.965, respectively, in five-fold cross-validation and of 87.3%, 92.6%, 76.0%, and 0.940, respectively, in external validation. These parameters indicated that our ensemble-SVM model was more stable and gave more accurate predictions than previous models. The model could therefore be used to effectively predict aquatic toxicity and assess risks posed to aquatic ecosystems. From the predictions made by the two types of models, we identified several structures most relevant to acute aquatic toxicity; this information may be important for aquatic toxicology experiments and aquatic system management.
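The four classification statistics quoted in the abstract (accuracy, sensitivity, specificity, and area under the ROC curve) can be computed from predicted labels and scores as in the minimal sketch below. This is an illustration only, not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions, and "1" is assumed to mark the toxic (positive) class.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels; 1 is assumed to be the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate).

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 toxic (1) and 4 non-toxic (0) compounds with model scores.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(classification_metrics(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC is threshold-free, which is one reason abstracts such as this one report all four values together.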
