Comparison of individual and ensemble machine learning models for prediction of sulphate levels in untreated and treated Acid Mine Drainage.

Affiliations

Molecular Sciences Institute, School of Chemistry, University of the Witwatersrand, Private Bag X3, Johannesburg, 2050, South Africa.

Pharmacy Department, School of Healthcare Sciences, University of Limpopo, Turfloop Campus, Polokwane, 0727, South Africa.

Publication information

Environ Monit Assess. 2024 Mar 2;196(4):332. doi: 10.1007/s10661-024-12467-8.

DOI:10.1007/s10661-024-12467-8

PMID:38429461

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10907470/

Abstract

Machine learning (ML) was used to predict sulphate levels in an Acid Mine Drainage (AMD) water quality dataset, providing data for further evaluation of the potential extraction of octathiocane (S₈), a commercially useful by-product, from AMD. The individual ML regressors were Linear Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO), Ridge (RD), Elastic Net (EN), K-Nearest Neighbours (KNN), Support Vector Regression (SVR), Decision Tree (DT), Extreme Gradient Boosting (XGBoost), Random Forest (RF) and Multi-Layer Perceptron Artificial Neural Network (MLP). These models, and Stacking Ensemble (SE-ML) combinations of them, were successfully used to predict sulphate levels. The best-performing model was an SE-ML regressor trained on untreated AMD that stacked seven of the best-performing individual models and fed their predictions to an LR meta-learner; it achieved a Mean Squared Error (MSE) of 0.000011, a Mean Absolute Error (MAE) of 0.002617 and an R² of 0.9997. Temperature (°C), Total Dissolved Solids (mg/L) and, importantly, iron (mg/L) were highly correlated with sulphate (mg/L); iron showed a strong positive linear correlation, which indicated dissolved products from pyrite oxidation. Ensemble learning (bagging, boosting and stacking) outperformed the individual methods because it combines the predictive accuracies of several models. Surprisingly, an SE-ML that combined all models and an SE-ML that combined only the best-performing models differed only slightly in accuracy, which indicated that including poorly performing models in the stack had no adverse effect on its predictive performance.
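The stacking setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic data in place of the AMD dataset (which is not reproduced here), and stacks only four arbitrarily chosen base regressors rather than the seven used in the study. The feature names in the comments, the hyperparameters and the coefficients used to generate the data are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): three predictors loosely playing the
# roles of temperature, TDS and iron. The target is NOT real sulphate data.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 3))
y = 0.2 * X[:, 0] + 0.5 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(0.0, 0.05, 400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base (level-0) regressors; the choice and settings are illustrative only.
base_models = [
    ("ridge", Ridge(alpha=1.0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))),
    ("tree", DecisionTreeRegressor(max_depth=5, random_state=0)),
    ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
]

# Level-1 meta-learner: a linear regression fitted on the cross-validated
# (out-of-fold) predictions of the base models.
stack = StackingRegressor(
    estimators=base_models,
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# The three metrics reported in the abstract, computed on the held-out split.
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE={mse:.6f}  MAE={mae:.6f}  R2={r2:.4f}")
```

In this sketch the meta-learner's fitted coefficients (`stack.final_estimator_.coef_`) show how much weight each base model receives. A meta-learner that can assign small weights to weak models is one plausible reason why adding them to a stack need not hurt accuracy, although the abstract itself does not state the mechanism.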
