Machine learning for predicting halogen radical reactivity toward aqueous organic chemicals.

Authors

Liang Youheng, Huangfu Xiaoliu, Huang Ruixing, Han Zhenpeng, Wu Sisi, Wang Jingrui, Long Xinlong, Ma Jun, He Qiang

Affiliations

Key Laboratory of Eco-Environments in Three Gorges Reservoir Region, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400044, China.

Publication information

J Hazard Mater. 2024 Jul 5;472:134501. doi: 10.1016/j.jhazmat.2024.134501. Epub 2024 May 6.

DOI: 10.1016/j.jhazmat.2024.134501
PMID: 38735182
Abstract

Rapid advances in machine learning (ML) provide fast, accurate, and widely applicable methods for predicting free radical-mediated organic pollutant reactivity. In this study, the rate constants (logk) of four halogen radicals were predicted using Morgan fingerprint (MF) and Mordred descriptor (MD) in combination with a series of ML models. The findings highlighted that making accurate predictions for various datasets depended on an effective combination of descriptors and algorithms. To further alleviate the challenge of limited sample size, we introduced a data combination strategy that improved prediction accuracy and mitigated overfitting by combining different datasets. The Light Gradient Boosting Machine (LightGBM) with MF and Random Forest (RF) with MD models based on the unified dataset were finally selected as the optimal models. The SHapley Additive exPlanations revealed insights: the MF-LightGBM model successfully captured the influence of electron-withdrawing/donating groups, while autocorrelation, walk count and information content descriptors in the MD-RF model were identified as key features. Furthermore, the important contribution of pH was emphasized. The results of the applicability domain analysis further supported that the developed model can make reliable predictions for query compounds across a broader range. Finally, a practical web application for logk calculations was built.
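The abstract does not spell out how the data combination strategy was implemented. One plausible reading, offered here only as a minimal sketch, is that the per-radical datasets are stacked into a single table, with pH and a one-hot radical indicator appended to the structural features so that one model can be trained on all records. The function name `combine_datasets`, the record layout, the placeholder labels `radical_A` and `radical_B`, and every number in the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical record layout: (fingerprint bits, pH, logk).
Record = Tuple[Sequence[int], float, float]


def combine_datasets(
    datasets: Dict[str, List[Record]],
) -> Tuple[List[List[float]], List[float], List[str]]:
    """Stack per-radical datasets into one feature matrix and target list.

    Each output row is: fingerprint bits + [pH] + one-hot radical indicator.
    This is an assumed encoding, not the authors' documented procedure.
    """
    radicals = sorted(datasets)  # fixed column order for the one-hot block
    X: List[List[float]] = []
    y: List[float] = []
    for name in radicals:
        one_hot = [1.0 if name == r else 0.0 for r in radicals]
        for bits, ph, logk in datasets[name]:
            X.append([float(b) for b in bits] + [ph] + one_hot)
            y.append(logk)
    return X, y, radicals


# Toy placeholder data (made-up values, 4-bit "fingerprints" for brevity).
toy = {
    "radical_A": [([1, 0, 1, 0], 7.0, 9.1), ([0, 1, 1, 0], 7.0, 8.4)],
    "radical_B": [([1, 1, 0, 0], 5.0, 7.9)],
}

X, y, radicals = combine_datasets(toy)
print(radicals)  # column order of the one-hot block
print(X[0])      # first unified feature row
```

In a real pipeline the fingerprint bits would come from a cheminformatics toolkit and the unified `X`, `y` would be passed to a regressor such as LightGBM or a random forest; those steps are left out here so that the sketch depends only on the standard library.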


Similar articles

1
Machine learning for predicting halogen radical reactivity toward aqueous organic chemicals.
J Hazard Mater. 2024 Jul 5;472:134501. doi: 10.1016/j.jhazmat.2024.134501. Epub 2024 May 6.
2
Machine learning approaches to predict the apparent rate constants for aqueous organic compounds by ferrate.
J Environ Manage. 2023 Mar 1;329:116904. doi: 10.1016/j.jenvman.2022.116904. Epub 2022 Dec 16.
3
Application of machine learning and deep learning methods for hydrated electron rate constant prediction.
Environ Res. 2023 Aug 15;231(Pt 1):115996. doi: 10.1016/j.envres.2023.115996. Epub 2023 Apr 25.
4
A new perspective on predicting the reaction rate constants of hydrated electrons for organic contaminants: Exploring molecular structure characterization methods and ambient conditions.
Sci Total Environ. 2023 Dec 15;904:166316. doi: 10.1016/j.scitotenv.2023.166316. Epub 2023 Aug 15.
5
Prediction of free radical reactions toward organic pollutants with easily accessible molecular descriptors.
Chemosphere. 2024 Jan;346:140660. doi: 10.1016/j.chemosphere.2023.140660. Epub 2023 Nov 9.
6
Predicting adsorption of organic compounds onto graphene and black phosphorus by molecular dynamics and machine learning.
Environ Sci Pollut Res Int. 2023 Oct;30(50):108846-108854. doi: 10.1007/s11356-023-29962-z. Epub 2023 Sep 27.
7
Prediction of organic compound aqueous solubility using machine learning: a comparison study of descriptor-based and fingerprints-based models.
J Cheminform. 2023 Oct 18;15(1):99. doi: 10.1186/s13321-023-00752-6.
8
Predicting reactivity dynamics of halogen species and trace organic contaminants using machine learning models.
Chemosphere. 2024 Jan;346:140659. doi: 10.1016/j.chemosphere.2023.140659. Epub 2023 Nov 9.
9
Improved GNNs for Log  Prediction by Transferring Knowledge from Low-Fidelity Data.
J Chem Inf Model. 2023 Apr 24;63(8):2345-2359. doi: 10.1021/acs.jcim.2c01564. Epub 2023 Mar 31.
10
Quantitative structure-property relationships for the calculation of the soil adsorption coefficient using machine learning algorithms with calculated chemical properties from open-source software.
Environ Res. 2021 May;196:110363. doi: 10.1016/j.envres.2020.110363. Epub 2020 Oct 22.

Cited by

1
A multi-dimensional computational framework of drug-induced hepatotoxicity: integrating molecular structure features with disease pathogenesis.
Brief Bioinform. 2025 Aug 31;26(5). doi: 10.1093/bib/bbaf456.