Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, NJ 08544, United States.
School of Engineering and Applied Science, Princeton University, Princeton, NJ 08544, United States.
Water Res. 2022 Jul 15;220:118714. doi: 10.1016/j.watres.2022.118714. Epub 2022 Jun 4.
Many wastewater utilities have discharge permits tied directly to the flow of the receiving river, so accurate prediction of hydraulic throughput is critical for safe operation and environmental protection. Current operation based on empirical knowledge faces many challenges. In this study, we developed and assessed daily-adaptive, probabilistic soft-sensor prediction models that forecast the next month's average receiving river flowrate to guide utility operations. Among the 11 machine-learning methods compared, extra trees regression exhibits the desired deterministic prediction accuracy at day 0 (overall accuracy index: 3.9 × 10 1/cms; cms: cubic meters per second), and its accuracy increases steadily over the course of the month (e.g., the mean absolute percentage error (MAPE) and root mean square error (RMSE) decrease from 41.46% and 23.31 cms to 3.31% and 2.81 cms, respectively). The overall classification accuracy across three river flow classes is 0.79 at the beginning of the predicted month and increases to about 0.97 over its course. To manage the uncertainty caused by potential false negative classifications (overestimations), a probabilistic assessment of the predictions based on the 95% lower prediction interval (PI) bound was developed; it reduces false negative classification from 17% to nearly zero at a slight cost in overall classification accuracy.
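The two error metrics and the idea of classifying on a lower PI bound rather than on the point forecast can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class boundaries (`LOW_MAX`, `MEDIUM_MAX`), the ensemble member values, and the choice to take the lower bound as the empirical 2.5th percentile of ensemble member predictions are all assumptions made for illustration; the abstract does not state how the PI is constructed or where the class thresholds lie.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the inputs (here cms)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def lower_bound(member_predictions, alpha=0.05):
    """Empirical alpha/2 quantile of ensemble member predictions.

    Assumption: a two-sided (1 - alpha) interval estimated from the spread
    of individual ensemble members, with linear interpolation between
    order statistics.
    """
    values = sorted(member_predictions)
    pos = (alpha / 2.0) * (len(values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(values) - 1)
    return values[lo] + (pos - lo) * (values[hi] - values[lo])


# Hypothetical class boundaries in cms; the real permit thresholds are not given.
LOW_MAX = 10.0
MEDIUM_MAX = 30.0


def flow_class(flow_cms):
    """Map a monthly average flow to one of three illustrative classes."""
    if flow_cms < LOW_MAX:
        return "low"
    if flow_cms < MEDIUM_MAX:
        return "medium"
    return "high"


# Hypothetical per-member forecasts of next month's average flow (cms).
members = [8.0, 9.5, 11.0, 12.0, 12.5, 13.0, 14.0, 15.5]
observed = 9.0

point_forecast = sum(members) / len(members)
conservative_forecast = lower_bound(members)

point_class = flow_class(point_forecast)                # class from the mean forecast
conservative_class = flow_class(conservative_forecast)  # class from the lower bound
observed_class = flow_class(observed)
```

In this made-up case the point forecast lands in a higher flow class than the observed flow, which is the overestimation the abstract treats as the risky error, while the lower bound falls in the same class as the observation. The trade-off is that a lower bound will sometimes underestimate the class when the point forecast was already correct, consistent with the slight loss of overall accuracy reported above.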