Hybrid decision tree-based machine learning models for short-term water quality prediction.

Affiliations

State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation, Southwest Petroleum University, Chengdu, 610500, China; Trenchless Technology Center, Louisiana Tech University, Ruston, LA, 71270, United States.

School of Science, Southwest University of Science and Technology, Mianyang, 621010, China.

Publication information

Chemosphere. 2020 Jun;249:126169. doi: 10.1016/j.chemosphere.2020.126169. Epub 2020 Feb 11.

DOI:10.1016/j.chemosphere.2020.126169
PMID:32078849
Abstract

Water resources are the foundation of people's lives and of economic development, and are closely tied to health and the environment. Accurate prediction of water quality is key to improving water management and pollution control. In this paper, two novel hybrid decision tree-based machine learning models are proposed to obtain more accurate short-term water quality predictions. The base models of the two hybrids are extreme gradient boosting (XGBoost) and random forest (RF), each combined with an advanced data denoising technique, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN). As a case study, 1875 hourly observations were collected from May 1, 2019 to July 20, 2019 at the Gales Creek site in the basin of the Tualatin River (one of the most polluted rivers in the world). The two hybrid models are used to predict six water quality indicators: water temperature, dissolved oxygen, pH value, specific conductance, turbidity, and fluorescent dissolved organic matter. Six error metrics serve as the basis for performance evaluation, and the results of the two models are compared with those of four conventional models. The results reveal that: (1) CEEMDAN-RF performs best in predicting temperature, dissolved oxygen, and specific conductance, with mean absolute percentage errors (MAPEs) of 0.69%, 1.05%, and 0.90%, respectively. CEEMDAN-XGBoost performs best in predicting pH value, turbidity, and fluorescent dissolved organic matter, with MAPEs of 0.27%, 14.94%, and 1.59%, respectively. (2) The average MAPEs of the CEEMDAN-RF and CEEMDAN-XGBoost models are the smallest, at 3.90% and 3.71%, respectively, indicating that their overall prediction performance is the best. The stability of the prediction models is also discussed; the analysis shows that the prediction stability of CEEMDAN-RF and CEEMDAN-XGBoost is higher than that of the other benchmark models.
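The abstract reports its headline results as mean absolute percentage errors (MAPEs). As a minimal sketch of how that metric is commonly computed, assuming the standard definition (the mean of |actual - predicted| / |actual|, expressed in percent), the following may help. The function name `mape` and the sample values are illustrative assumptions and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must be non-empty")
    if any(a == 0 for a in actual):
        # The ratio is undefined when an observed value is zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical hourly water temperatures (deg C); NOT data from the paper.
observed = [10.0, 12.5, 20.0, 16.0]
forecast = [10.1, 12.5, 19.8, 16.4]
temperature_mape = mape(observed, forecast)
print(f"MAPE: {temperature_mape:.3f}%")
```

Because MAPE divides by the observed value, indicators that can approach zero tend to produce large percentage errors even when the absolute error is small. This may be one reason the reported turbidity MAPE is so much higher than the others, though the abstract does not say so.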