Randive Pallavi, Bhagat Madhuri S, Bhorkar Mangesh P, Bhagat Rajesh M, Vinchurkar Shilpa M, Shelare Sagar, Sharma Shubham, Beemkumar N, Hemalatha S, Kumar Parveen, Kedia Ankit, El Sayed Massoud Ehab, Gupta Deepak, Lozanovic Jasmina
Civil Engineering Department, KDK College of Engineering, Nagpur, Maharashtra, 440024, India.
Civil Engineering Department, Yeshwantrao Chavan College of Engineering (YCCE), Nagpur, Maharashtra, 441110, India.
Sci Rep. 2025 May 8;15(1):16096. doi: 10.1038/s41598-025-96750-9.
Efficiency optimization for natural coagulants is often limited by unscientific trial-and-error approaches, which cannot accurately predict the complex interactions among jet mixing parameters, coagulant dosage, and environmental conditions. To overcome these limitations, this paper proposes hybrid machine learning models to enhance flocculation efficiency. We combine the CatBoost model with the Neural Tangent Kernel (NTK) to learn the nonlinear interactions among jet velocity, mixing time, coagulant dose, pH, and turbidity. CatBoost handles categorical data, such as the type of coagulant, effectively, while NTK improves the model's generalization, especially for small samples or experimental datasets. Self-organizing maps (SOMs) and multivariate adaptive regression splines (MARS) are used to recognize patterns and trace the crucial interactions among mixing parameters. Reinforcement learning techniques, namely deep deterministic policy gradient (DDPG) and soft actor-critic (SAC), dynamically optimize jet velocity, mixing time, and coagulant dosage in real time. Automating model tuning with neural architecture search (NAS) and Hyperband reduced tuning time by 40%. The proposed models improve flocculation efficiency by 20-25% and achieve a predictive accuracy of 95-97%. Importantly, the models remain interpretable through SHAP and counterfactual explanations, which give actionable insights into the factors that most influence flocculation efficiency. This work advances the field by introducing robust, interpretable, real-time optimization methods that offer a practical tool for making water treatment processes both sustainable and efficient.
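The interpretability step named in the abstract rests on Shapley values, which attribute a prediction to each input relative to a baseline. The following is a minimal sketch of that idea only, not the authors' pipeline: it enumerates every feature subset exactly for the five inputs listed above. The function `toy_efficiency`, its coefficients, and the operating points `EXAMPLE_POINT` and `BASELINE` are hypothetical stand-ins chosen for illustration; a trained model would normally be explained with an approximating library such as SHAP rather than exhaustive enumeration.

```python
from itertools import combinations
from math import factorial

# The five inputs named in the abstract.
FEATURES = ["jet_velocity", "mixing_time", "coagulant_dose", "pH", "turbidity"]


def toy_efficiency(x):
    """Hypothetical stand-in for a trained model (assumed coefficients)."""
    v = x["jet_velocity"]
    t = x["mixing_time"]
    d = x["coagulant_dose"]
    ph = x["pH"]
    ntu = x["turbidity"]
    return (50.0 + 4.0 * v + 0.5 * t + 0.3 * d
            - 2.0 * (ph - 7.0) ** 2 + 0.02 * v * d - 0.05 * ntu)


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are held at their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(FEATURES)
    phi = {}
    for name in FEATURES:
        others = [f for f in FEATURES if f != name]
        total = 0.0
        for k in range(n):  # k = size of the coalition drawn from the others
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_i = {f: (x[f] if f in subset else baseline[f])
                             for f in FEATURES}
                with_i = dict(without_i)
                with_i[name] = x[name]
                total += weight * (model(with_i) - model(without_i))
        phi[name] = total
    return phi


# Hypothetical operating point and reference point (illustrative values only).
EXAMPLE_POINT = {"jet_velocity": 2.0, "mixing_time": 10.0,
                 "coagulant_dose": 30.0, "pH": 6.5, "turbidity": 80.0}
BASELINE = {"jet_velocity": 1.0, "mixing_time": 5.0,
            "coagulant_dose": 20.0, "pH": 7.0, "turbidity": 100.0}

if __name__ == "__main__":
    for feature, value in shapley_values(toy_efficiency, EXAMPLE_POINT, BASELINE).items():
        print(f"{feature}: {value:+.3f}")
```

The attributions sum to the difference between the prediction at the example point and at the baseline, which is the property that makes them usable for ranking influencing factors. Purely additive terms (here mixing time, pH, and turbidity) receive exactly their own contribution, while the assumed velocity-dose interaction term is shared between those two inputs.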