Key Lab of Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China; Sino-Danish Center for Education and Research, University of Chinese Academy of Sciences, Beijing, China.
Key Lab of Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, 18 Shuangqing Road, Haidian District, Beijing, 100085, China; School of Civil & Environmental Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen, 518055, PR China; State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin, 150001, China.
Chemosphere. 2021 Sep;279:130498. doi: 10.1016/j.chemosphere.2021.130498. Epub 2021 Apr 14.
Wastewater treatment plants (WWTPs) are designed to eliminate pollutants and alleviate environmental pollution resulting from human activities. However, the construction and operation of WWTPs consume resources, emit greenhouse gases (GHGs) and produce residual sludge, and thus require further optimization. WWTPs are complex to control and optimize because of their high non-linearity and variability. This study used a novel technique, multi-agent deep reinforcement learning (MADRL), to simultaneously optimize dissolved oxygen (DO) and chemical dosage in a WWTP. The reward function was designed from a life cycle perspective to achieve sustainable optimization. Five scenarios were considered: a baseline, three scenarios with different effluent quality, and a cost-oriented scenario. The results show that optimization based on life cycle assessment (LCA) has lower environmental impacts than the baseline scenario, with cost, energy consumption and greenhouse gas emissions reduced to 0.890 CNY/m³-ww, 0.530 kWh/m³-ww and 2.491 kg CO₂-eq/m³-ww, respectively. The cost-oriented control strategy exhibits overall performance comparable to the LCA-driven strategy: it sacrifices some environmental benefit but achieves a lower cost of 0.873 CNY/m³-ww. It is worth mentioning that the retrofitting of WWTPs based on resources should be implemented with consideration of impact transfer. Specifically, the LCA-SW scenario reduces eutrophication potential by 10 kg PO₄³⁻-eq relative to the baseline within 10 days, while significantly increasing other indicators. The major contributors to each indicator are identified for future study and improvement. Finally, because novel dynamic control strategies require advanced sensors or large amounts of data, the selection of a control strategy should also consider economic and ecological conditions. This work still has limitations, and further studies are required.
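As a rough illustration of what a life-cycle-oriented reward shared by two control agents might look like, the sketch below scores one control step by a weighted sum of per-cubic-metre impacts. The abstract does not state the form of the reward, so the weighted-sum structure, the weight values, the agent names and every identifier here are assumptions made for illustration, not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepImpacts:
    """Impacts of one control step, per cubic metre of wastewater treated."""
    cost_cny: float       # operating cost, CNY/m3
    energy_kwh: float     # energy consumption, kWh/m3
    ghg_kg_co2e: float    # greenhouse gas emissions, kg CO2-eq/m3


# Hypothetical weights; the source does not report any weighting scheme.
DEFAULT_WEIGHTS = {"cost_cny": 1.0, "energy_kwh": 1.0, "ghg_kg_co2e": 1.0}


def lca_reward(impacts: StepImpacts, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Return the negative weighted sum of impacts (lower impact, higher reward)."""
    penalty = sum(w * getattr(impacts, name) for name, w in weights.items())
    return -penalty


def shared_rewards(impacts: StepImpacts) -> dict:
    """Give both (hypothetical) agents the same cooperative reward."""
    reward = lca_reward(impacts)
    return {"do_agent": reward, "dosage_agent": reward}


if __name__ == "__main__":
    # Values reported in the abstract for the LCA-driven scenario.
    lca_scenario = StepImpacts(cost_cny=0.890, energy_kwh=0.530, ghg_kg_co2e=2.491)
    print(shared_rewards(lca_scenario))
```

Under this assumed scheme, a cost-oriented variant would correspond to setting the energy and emission weights to zero, which is one way to read the trade-off between the cost-oriented and LCA-driven strategies described above.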