How small is big enough? Big data-driven machine learning predictions for a full-scale wastewater treatment plant.

Authors

Ma Yanyan, Qiao Yiheng, Chen Mengxue, Rui Dongni, Zhang Xuxiang, Liu Weijing, Ye Lin

Affiliations

State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China.

Nanjing Gaoke Environmental Technology Co., Ltd., Nanjing 210038, China.

Publication information

Water Res. 2025 Apr 15;274:123041. doi: 10.1016/j.watres.2024.123041. Epub 2024 Dec 25.

DOI: 10.1016/j.watres.2024.123041
PMID: 39740325

Abstract

Wastewater treatment plants (WWTPs) generate vast amounts of water quality, operational, and biological data. The potential of these big data, particularly through machine learning (ML), to improve WWTP management is increasingly recognized. However, the costs associated with data collection and processing can rise sharply as datasets grow larger, and research on determining the optimal data volume for effective ML application remains limited. In this study, we comprehensively analyzed water quality, operational, and biological data collected from a full-scale WWTP over 970 days. Our results demonstrate that ML models can predict not only operational and water quality parameters (concentrations of dissolved oxygen and effluent chemical oxygen demand) but also the abundances of functional bacteria. Notably, we discovered that increasing data volume does not always improve model performance, and that data collection intervals do not need to be excessively small, as moderate intervals can still yield reliable predictions. These findings suggest that excessively large datasets may not be necessary for effective ML predictions in WWTPs. Overall, this study underscores the importance of optimizing dataset size to balance computation efficiency and prediction accuracy, providing valuable insights into data management strategies that can enhance the operational efficiency and sustainability of WWTPs.
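
The abstract's two questions (how much data, and how often to sample) can be illustrated with a learning-curve experiment. The sketch below is only an illustration under loud assumptions: it uses synthetic data, a plain least-squares model, and hypothetical helper names (`make_synthetic_series`, `learning_curve`, `interval_curve`). It is not the authors' data, models, or pipeline; only the 970-day length is taken from the abstract.

```python
import numpy as np


def make_synthetic_series(n_days=970, seed=0):
    """Synthetic stand-in for daily plant records (NOT the study's data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_days, 3))              # three hypothetical input features
    true_w = np.array([1.5, -0.8, 0.3])           # assumed linear relationship
    y = X @ true_w + rng.normal(scale=0.5, size=n_days)
    return X, y


def fit_and_score(X_train, y_train, X_test, y_test):
    """Fit least-squares linear regression and return the test RMSE."""
    A = np.column_stack([X_train, np.ones(len(X_train))])   # add intercept column
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    pred = np.column_stack([X_test, np.ones(len(X_test))]) @ coef
    return float(np.sqrt(np.mean((pred - y_test) ** 2)))


def learning_curve(X, y, train_sizes, n_test=200):
    """Test RMSE when training on the first n days; the last n_test days are held out."""
    X_pool, y_pool = X[:-n_test], y[:-n_test]
    X_test, y_test = X[-n_test:], y[-n_test:]
    return {n: fit_and_score(X_pool[:n], y_pool[:n], X_test, y_test)
            for n in train_sizes}


def interval_curve(X, y, intervals, n_test=200):
    """Test RMSE when the training pool keeps only every k-th day."""
    X_pool, y_pool = X[:-n_test], y[:-n_test]
    X_test, y_test = X[-n_test:], y[-n_test:]
    return {k: fit_and_score(X_pool[::k], y_pool[::k], X_test, y_test)
            for k in intervals}


if __name__ == "__main__":
    X, y = make_synthetic_series()
    for n, rmse in learning_curve(X, y, [10, 50, 100, 300, 770]).items():
        print(f"train days = {n:4d}  test RMSE = {rmse:.3f}")
    for k, rmse in interval_curve(X, y, [1, 7, 30]).items():
        print(f"interval = {k:2d} days  test RMSE = {rmse:.3f}")
```

In a setup like this, the error typically stops improving once the training set is large enough for the model to approach the noise floor, which is the kind of plateau the abstract describes. Where that plateau falls for real plant data depends on the model and the target variable, so the numbers printed here say nothing about the study's actual results.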

Similar articles

1. How small is big enough? Big data-driven machine learning predictions for a full-scale wastewater treatment plant.
   Water Res. 2025 Apr 15;274:123041. doi: 10.1016/j.watres.2024.123041. Epub 2024 Dec 25.
2. Exploring potential machine learning application based on big data for prediction of wastewater quality from different full-scale wastewater treatment plants.
   Sci Total Environ. 2022 Aug 1;832:154930. doi: 10.1016/j.scitotenv.2022.154930. Epub 2022 Apr 4.
3. Predicting effluent quality parameters for wastewater treatment plant: A machine learning-based methodology.
   Chemosphere. 2024 Mar;352:141472. doi: 10.1016/j.chemosphere.2024.141472. Epub 2024 Feb 19.
4. Enhancing effluent quality prediction in wastewater treatment plants through the integration of factor analysis and machine learning.
   Bioresour Technol. 2024 Feb;393:130008. doi: 10.1016/j.biortech.2023.130008. Epub 2023 Nov 18.
5. Quality evaluation parameter and classification model for effluents of wastewater treatment plant based on machine learning.
   Water Res. 2025 Jan 1;268(Pt B):122696. doi: 10.1016/j.watres.2024.122696. Epub 2024 Oct 24.
6. Learning a neural network-based soft sensor with double-errors parallel optimization towards effluent variable prediction in wastewater treatment plants.
   J Environ Manage. 2024 Aug;366:121907. doi: 10.1016/j.jenvman.2024.121907. Epub 2024 Jul 24.
7. Attention-based deep learning models for predicting anomalous shock of wastewater treatment plants.
   Water Res. 2025 May 1;275:123192. doi: 10.1016/j.watres.2025.123192. Epub 2025 Jan 23.
8. Model construction and application for effluent prediction in wastewater treatment plant: Data processing method optimization and process parameters integration.
   J Environ Manage. 2022 Jan 15;302(Pt A):114020. doi: 10.1016/j.jenvman.2021.114020. Epub 2021 Oct 28.
9. Bayesian Optimization-Enhanced Reinforcement learning for Self-adaptive and multi-objective control of wastewater treatment.
   Bioresour Technol. 2025 Apr;421:132210. doi: 10.1016/j.biortech.2025.132210. Epub 2025 Feb 9.
10. Machine learning facilitated the conceptual design of an alum dosing system for phosphorus removal in a wastewater treatment plant.
    Chemosphere. 2024 Mar;351:141154. doi: 10.1016/j.chemosphere.2024.141154. Epub 2024 Jan 9.