Bellamoli Francesca, Vian Marco, Di Iorio Mattia, Melgani Farid
Department of Information Engineering and Computer Science, University of Trento, via Sommarive 9, Trento 38123, Italy; ETC Sustainable Solutions Srl, via dei Palustei 16, Trento 38121, Italy.
ETC Sustainable Solutions Srl, via dei Palustei 16, Trento 38121, Italy.
Water Sci Technol. 2024 Dec;90(11):3123-3138. doi: 10.2166/wst.2024.387. Epub 2024 Nov 27.
Intermittent aeration controllers are increasingly used in wastewater treatment plants (WWTPs) to reduce aeration costs through continuous ammonia and oxygen measurements, but detecting sensor and process anomalies in these systems remains challenging. Applying machine learning to this unbalanced, multivariate, multiclass classification problem requires large amounts of data, which are difficult to obtain from a new plant. This study develops a machine learning algorithm that identifies anomalies in intermittent aeration WWTPs and can be adapted to new plants with limited data. Using active learning, the method iteratively selects samples from the target domain to fine-tune a gradient-boosting model initially trained on data from 17 plants. Three sampling strategies were tested; low-probability and high-entropy sampling proved effective in early adaptation, achieving an F2-score close to the optimum with minimal sample use. The objective is to deploy these models as decision support systems for WWTP management, providing a strategy for efficient model adaptation to new plants while optimizing labeling efforts.
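The sample-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, "low probability" is assumed here to mean least-confidence sampling (ranking by one minus the highest predicted class probability), and the class probabilities are assumed to come from the pre-trained gradient-boosting model. The F2-score is computed with the standard F-beta formula for beta = 2, which weights recall more heavily than precision.

```python
import math


def low_probability_score(probs):
    """Uncertainty as 1 - max class probability (assumed reading of 'low probability')."""
    return 1.0 - max(probs)


def entropy_score(probs):
    """Shannon entropy of one predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_batch(pool_probs, k, strategy="entropy"):
    """Return indices of the k most uncertain unlabeled samples in the pool."""
    if strategy == "entropy":
        score = entropy_score
    elif strategy == "low_probability":
        score = low_probability_score
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Rank pool indices from most to least uncertain; sorted() is stable on ties.
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: score(pool_probs[i]),
                    reverse=True)
    return ranked[:k]


def f_beta(tp, fp, fn, beta=2.0):
    """F-beta from true positives, false positives, and false negatives."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical predicted probabilities for three unlabeled target-plant samples.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.60, 0.30, 0.10],  # moderately uncertain
]
to_label = select_batch(pool, k=2, strategy="entropy")
```

In a full active-learning loop, the selected samples would be labeled by an operator, added to the training set, and used to fine-tune the model before the next selection round; that loop is omitted here.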