A predictive model of recreational water quality based on adaptive synthetic sampling algorithms and machine learning.

Affiliations

School of Environment, Faculty of Science, University of Auckland, New Zealand.

Publication information

Water Res. 2020 Jun 15;177:115788. doi: 10.1016/j.watres.2020.115788. Epub 2020 Apr 13.

DOI:10.1016/j.watres.2020.115788
PMID:32330740
Abstract

Predicting recreational water quality is one of the most difficult tasks in water management, with major implications for humans and society. Many data-driven models have been used to predict water quality indicators to allow a real-time assessment of public health risk. This assessment is most commonly based on Faecal Indicator Bacteria (FIB), with the value of FIB compared with thresholds published in guidelines. However, FIB values tend to be imbalanced within water quality datasets, with a small proportion of data exceeding guideline thresholds and a far larger proportion that does not. This can be a limiting factor in the uptake of model predictions since, even if the overall accuracy is high, the sensitivity of the predictions can be low. To address this issue, this paper proposes an adaptive synthetic sampling algorithm (ADASYN) to generate synthetic above-threshold FIB instances and tests the validity of the approach for the prediction of recreational water quality. The models in this paper are based on four machine learning techniques: k-mean nearest neighbour, boosting decision tree, support vector machine, and multi-layer perceptron artificial neural network, and are applied to five different locations in Auckland, New Zealand. Aside from support vector machine, all models provide favourable predictions with relatively high sensitivity (around 75%) and overall accuracy (over 90%), indicating that both the compliant and exceedance conditions can be effectively predicted through the use of more sophisticated model training which involves artificial data. Considering model accuracy and stability, boosting decision trees (BDT) and the multi-layer perceptron artificial neural network (MLP-ANN) are the two best models, and the multi-layer perceptron is the most efficient, with the shortest computation time.
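The resampling step described in the abstract can be illustrated with a minimal pure-Python sketch of the general ADASYN procedure. This is not the authors' implementation: the function name `adasyn`, the toy features (rainfall and turbidity), the parameter defaults, and the uniform-weight fallback are all assumptions made for illustration. The idea is that minority (above-threshold) samples surrounded by more majority (compliant) neighbours are treated as harder to learn and receive more synthetic neighbours.

```python
import math
import random


def adasyn(minority, majority, k=5, beta=1.0, seed=0):
    """Generate synthetic minority points, ADASYN-style.

    minority, majority: lists of equal-length numeric tuples.
    beta: 1.0 aims for a roughly balanced set after resampling.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    n_to_make = int((len(majority) - len(minority)) * beta)

    # Label every point: 1 = minority (exceedance), 0 = majority (compliant).
    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]

    # Step 1: for each minority point, the share of its k nearest
    # neighbours (over the whole set) that belong to the majority class.
    ratios = []
    for i, x in enumerate(minority):
        others = [e for j, e in enumerate(labelled) if j != i]
        nearest = sorted(others, key=lambda e: math.dist(x, e[0]))[:k]
        ratios.append(sum(1 for _, label in nearest if label == 0) / len(nearest))

    # Step 2: normalise the ratios into weights. Falling back to uniform
    # weights when no minority point has majority neighbours is an
    # assumption of this sketch.
    total = sum(ratios)
    n = len(minority)
    weights = [r / total for r in ratios] if total > 0 else [1.0 / n] * n

    # Step 3: interpolate between each minority point and a random one of
    # its nearest minority neighbours; harder points get more samples.
    synthetic = []
    for i, (x, w) in enumerate(zip(minority, weights)):
        partners = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: math.dist(x, p),
        )[:k]
        for _ in range(round(w * n_to_make)):
            z = rng.choice(partners)
            lam = rng.random()
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, z)))
    return synthetic


# Hypothetical toy data: (rainfall_mm, turbidity_ntu) per sample.
data_rng = random.Random(42)
majority = [(data_rng.uniform(0, 10), data_rng.uniform(0, 5)) for _ in range(40)]
minority = [(data_rng.uniform(8, 30), data_rng.uniform(4, 12)) for _ in range(6)]

synthetic = adasyn(minority, majority, k=5, beta=1.0)
print(len(minority), "real exceedances;", len(synthetic), "synthetic ones added")
```

In practice the synthetic points would typically be added to the training split only, so that sensitivity and overall accuracy are still measured on real observations.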

Similar articles

1
A predictive model of recreational water quality based on adaptive synthetic sampling algorithms and machine learning.
Water Res. 2020 Jun 15;177:115788. doi: 10.1016/j.watres.2020.115788. Epub 2020 Apr 13.
2
Optimizing neural networks for medical data sets: A case study on neonatal apnea prediction.
Artif Intell Med. 2019 Jul;98:59-76. doi: 10.1016/j.artmed.2019.07.008. Epub 2019 Jul 25.
3
Class-imbalanced crash prediction based on real-time traffic and weather data: A driving simulator study.
Traffic Inj Prev. 2020;21(3):201-208. doi: 10.1080/15389588.2020.1723794. Epub 2020 Mar 3.
4
Seminal quality prediction using data mining methods.
Technol Health Care. 2014;22(4):531-45. doi: 10.3233/THC-140816.
5
Improving the performance of machine learning models for early warning of harmful algal blooms using an adaptive synthetic sampling method.
Water Res. 2021 Dec 1;207:117821. doi: 10.1016/j.watres.2021.117821. Epub 2021 Oct 30.
6
Efficient Prediction of Missed Clinical Appointment Using Machine Learning.
Comput Math Methods Med. 2021 Oct 22;2021:2376391. doi: 10.1155/2021/2376391. eCollection 2021.
7
Application of supervised machine learning algorithms in the classification of sagittal gait patterns of cerebral palsy children with spastic diplegia.
Comput Biol Med. 2019 Mar;106:33-39. doi: 10.1016/j.compbiomed.2019.01.009. Epub 2019 Jan 16.
8
Robust clustering-based hybrid technique enabling reliable reservoir water quality prediction with uncertainty quantification and spatial analysis.
J Environ Manage. 2024 Jun;362:121259. doi: 10.1016/j.jenvman.2024.121259. Epub 2024 Jun 3.
9
Using machine learning models to predict the effects of seasonal fluxes on Plesiomonas shigelloides population density.
Environ Pollut. 2023 Jan 15;317:120734. doi: 10.1016/j.envpol.2022.120734. Epub 2022 Nov 28.
10
Personal Health Information Inference Using Machine Learning on RNA Expression Data from Patients With Cancer: Algorithm Validation Study.
J Med Internet Res. 2020 Aug 10;22(8):e18387. doi: 10.2196/18387.

Cited by

1
Comparative analysis of machine learning models for detecting water quality anomalies in treatment plants.
Sci Rep. 2025 Aug 19;15(1):30453. doi: 10.1038/s41598-025-15517-4.
2
Mapping reservoir water quality from Sentinel-2 satellite data based on a new approach of weighted averaging: Application of Bayesian maximum entropy.
Sci Rep. 2024 Jul 16;14(1):16438. doi: 10.1038/s41598-024-66699-2.
3
Monitoring and warning for ammonia nitrogen pollution of urban river based on neural network algorithms.
Anal Sci. 2024 Oct;40(10):1867-1879. doi: 10.1007/s44211-024-00622-7. Epub 2024 Jun 23.
4
Advances in machine learning and IoT for water quality monitoring: A comprehensive review.
Heliyon. 2024 Mar 13;10(6):e27920. doi: 10.1016/j.heliyon.2024.e27920. eCollection 2024 Mar 30.
5
Machine Learning-Based Early Warning Level Prediction for Cyanobacterial Blooms Using Environmental Variable Selection and Data Resampling.
Toxics. 2023 Nov 23;11(12):955. doi: 10.3390/toxics11120955.
6
Cyanobacterial Algal Bloom Monitoring: Molecular Methods and Technologies for Freshwater Ecosystems.
Microorganisms. 2023 Mar 27;11(4):851. doi: 10.3390/microorganisms11040851.
7
Machine Learning to Dynamically Predict In-Hospital Venous Thromboembolism After Inguinal Hernia Surgery: Results From the CHAT-1 Study.
Clin Appl Thromb Hemost. 2023 Jan-Dec;29:10760296231171082. doi: 10.1177/10760296231171082.
8
An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment.
MethodsX. 2023 Feb 2;10:102034. doi: 10.1016/j.mex.2023.102034. eCollection 2023.
9
Groundwater Quality: The Application of Artificial Intelligence.
J Environ Public Health. 2022 Aug 24;2022:8425798. doi: 10.1155/2022/8425798. eCollection 2022.
10
The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm.
PLoS One. 2021 Aug 5;16(8):e0255684. doi: 10.1371/journal.pone.0255684. eCollection 2021.