
Improving predictions and understanding of primary and ultimate biodegradation rates with machine learning models.

Affiliations

School of Environment and Energy, South China University of Technology, Guangzhou, Guangdong 510006, People's Republic of China; The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, South China University of Technology, Guangzhou, Guangdong 510006, People's Republic of China.

Publication information

Sci Total Environ. 2023 Dec 15;904:166623. doi: 10.1016/j.scitotenv.2023.166623. Epub 2023 Aug 29.

DOI: 10.1016/j.scitotenv.2023.166623
PMID: 37652371

Abstract

This study aimed to develop machine learning-based quantitative structure-biodegradability relationship (QSBR) models for predicting primary and ultimate biodegradation rates of organic chemicals, which are essential parameters for environmental risk assessment. For this purpose, experimental primary and ultimate biodegradation rates of high consistency were compiled for 173 organic compounds. A large number of descriptors were calculated with a collection of quantum/computational chemistry software and tools to achieve comprehensive representation and interpretability. Following a pre-screening process, multiple QSBR models were developed for both primary and ultimate endpoints using three algorithms: extreme gradient boosting (XGBoost), support vector machine (SVM), and multiple linear regression (MLR). Furthermore, a unified QSBR model was constructed using the knowledge transfer technique and XGBoost. Results demonstrated that all QSBR models developed in this study performed well. In particular, SVM models exhibited a high level of goodness of fit (coefficient of determination on the training set of 0.973 for primary and 0.980 for ultimate), robustness (leave-one-out cross-validated coefficient of 0.953 for primary and 0.967 for ultimate), and external predictive ability (external explained variance of 0.947 for primary and 0.958 for ultimate). The knowledge transfer technique enhanced model performance by learning from properties of the two biodegradation endpoints. Williams plots were used to visualize the application domains of the models. Through SHapley Additive exPlanations (SHAP) analysis, this study identified key features affecting biodegradation rates. Notably, MDEO-12, APC2D1_C_O, and other features contributed to primary biodegradation, while AATS0v, AATS2v, and others inhibited it. For ultimate biodegradation, features such as No. of Rotatable Bonds, APC2D1_C_O, and minHBa were contributors, while C1SP3, Halogen Ratio, GGI4, and others hindered the process. The study also quantified the contribution of each feature to the predictions for individual chemicals. This research provides valuable tools for predicting both primary and ultimate biodegradation rates while offering insights into the mechanisms.
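
The validation quantities named in the abstract (training-set coefficient of determination, leave-one-out cross-validated coefficient, and the leverage axis of a Williams plot) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes a single-descriptor ordinary least squares fit as a stand-in for MLR, uses made-up descriptor and rate values, and applies the warning leverage h* = 3(p + 1)/n, a common QSAR convention that the abstract does not itself state. All function and variable names are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one descriptor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(xs, ys):
    """Goodness of fit: coefficient of determination on the training set."""
    a, b = fit_line(xs, ys)
    y_bar = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q2_loo(xs, ys):
    """Robustness: refit with each compound left out, then predict it."""
    y_bar = mean(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - press / ss_tot


def leverages(xs):
    """Diagonal of the hat matrix for a one-descriptor model with intercept."""
    n = len(xs)
    x_bar = mean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1.0 / n + (x - x_bar) ** 2 / sxx for x in xs]


# Made-up values for illustration only; they are not data from the study.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 9.0]
log_rate = [1.1, 1.4, 2.1, 2.4, 3.1, 3.4, 4.1, 9.4]

r2 = r_squared(descriptor, log_rate)
q2 = q2_loo(descriptor, log_rate)
h = leverages(descriptor)
h_star = 3 * (1 + 1) / len(descriptor)  # assumed convention: 3(p + 1)/n, p = 1
outside = [i for i, h_i in enumerate(h) if h_i > h_star]

print(f"R2 = {r2:.3f}, Q2_LOO = {q2:.3f}")
print(f"warning leverage h* = {h_star:.2f}, compounds beyond it: {outside}")
```

In a Williams plot, each compound's leverage is plotted against its standardized residual, and compounds to the right of h* are treated as structurally distant from the training set. In this toy data the last compound has a descriptor value far from the others, so it is the one flagged.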

