Dong Qiyu, Bai Shunwen, Wang Zhen, Zhao Xinyue, Yang Shanshan, Ren Nanqi
State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, 150090, Harbin, China.
J Environ Manage. 2023 Nov 15;346:118961. doi: 10.1016/j.jenvman.2023.118961. Epub 2023 Sep 13.
The design of constructed wetlands (CWs) is critical to ensuring effective wastewater treatment. However, the limited availability of reliable data can hamper the accuracy of CW effluent predictions, increasing design cost and time. In this study, a novel effluent prediction framework for CWs is proposed that combines data dimensionality reduction with virtual sample generation. Four machine learning algorithms (Cubist, random forest, support vector regression, and extreme learning machine) were used to identify the important features of CW design and to build prediction models from them. The extreme learning machine achieved the highest coefficient of determination and the lowest error, making it the most suitable algorithm for effluent prediction. A multi-distribution mega-trend-diffusion algorithm with particle swarm optimization was employed to generate virtual samples. These virtual samples were then combined with real samples to retrain the prediction model and verify the optimization effect. Comparative analysis demonstrated that integrating virtual samples significantly improved the prediction accuracy for ammonium and chemical oxygen demand. On average, the root mean square error decreased by 60.5% and 42.1%, and the mean absolute percentage error by 21.5% and 23.8%, for ammonium and chemical oxygen demand, respectively. Finally, a CW design process is proposed based on the prediction models and virtual samples. This integrated forward-prediction and reverse-design tool can efficiently support CW design when sample sizes are limited, ultimately leading to more accurate and cost-effective design solutions.
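The two error metrics reported in the abstract, root mean square error (RMSE) and mean absolute percentage error (MAPE), and the relative reduction between a baseline model and a retrained model can be computed as in the minimal sketch below. The function names and the numbers are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not state how the authors implemented these calculations.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero (the ratio would be undefined).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def percent_reduction(before, after):
    """Relative decrease of an error metric, in percent of its baseline value."""
    return 100.0 * (before - after) / before


# Made-up effluent concentrations (mg/L), for illustration only.
observed = [10.0, 20.0, 40.0]
baseline_pred = [14.0, 16.0, 44.0]   # hypothetical model trained on real samples
augmented_pred = [12.0, 18.0, 42.0]  # hypothetical model retrained with virtual samples

rmse_drop = percent_reduction(rmse(observed, baseline_pred),
                              rmse(observed, augmented_pred))
mape_drop = percent_reduction(mape(observed, baseline_pred),
                              mape(observed, augmented_pred))
print(f"RMSE reduction: {rmse_drop:.1f}%  MAPE reduction: {mape_drop:.1f}%")
```

Because MAPE divides each error by the observed value, it is sensitive to small observed concentrations, which is one reason the two metrics can improve by different amounts for the same model.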