Görenekli Kasim, Gülbağ Ali
Faculty of Computer and Information Sciences, Sakarya University, Sakarya 54050, Turkey.
Sensors (Basel). 2024 Sep 9;24(17):5846. doi: 10.3390/s24175846.
This study compares several Machine Learning (ML) techniques for predicting water consumption, using a comprehensive dataset from Kocaeli Province, Turkey. Accurate prediction of water consumption is crucial for effective water resource management and planning, particularly given the significant impact of the COVID-19 pandemic on water usage patterns. Four ML models were evaluated: Artificial Neural Networks (ANN), Random Forest (RF), Support Vector Machines (SVM), and Gradient Boosting Machines (GBM). In addition, optimization techniques such as Particle Swarm Optimization (PSO) and the Second-Order Optimization (SOO) Levenberg-Marquardt (LM) algorithm were employed to improve model performance. The models take historical data from previous months as input to improve accuracy and generalizability, so that predictions account for both short-term fluctuations and long-term trends. The performance of each model was assessed using cross-validation, and the R and correlation values of the best-performing models are reported in the results section. For instance, the GBM model achieved an R value of 0.881, indicating a strong capability to capture the underlying patterns in the data. This study is among the first to analyze water consumption prediction comprehensively with ML algorithms on a large-scale dataset of 5000 subscribers that includes the unique conditions imposed by the COVID-19 pandemic. The results highlight the strengths and limitations of each technique and provide insight into their applicability for water consumption prediction. The study aims to improve the understanding of ML applications in water management and offers practical recommendations for future research and implementation.
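The idea of using previous months as model inputs and scoring out-of-fold predictions with a correlation value can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the study: the consumption series is made up, the number of lags (3) and folds (3) are arbitrary, the fold layout is a plain contiguous split, and a one-variable least-squares line stands in for the ANN, RF, SVM, and GBM models that the study actually evaluated.

```python
from statistics import mean


def make_lagged(series, n_lags):
    """Turn a monthly series into (features, target) pairs.

    Each feature row holds the n_lags previous months; the target is the
    month that immediately follows them.
    """
    rows, targets = [], []
    for t in range(n_lags, len(series)):
        rows.append(series[t - n_lags:t])
        targets.append(series[t])
    return rows, targets


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mp = mean(actual), mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / (var_a * var_p) ** 0.5


def cross_validated_r(rows, targets, k):
    """Out-of-fold Pearson R of the stand-in model over k contiguous folds."""
    n = len(rows)
    xs = [mean(r) for r in rows]  # stand-in feature: average of the lagged months
    preds = [0.0] * n
    for fold in range(k):
        lo, hi = fold * n // k, (fold + 1) * n // k
        # Fit on every row outside the held-out fold [lo, hi).
        intercept, slope = fit_line(xs[:lo] + xs[hi:], targets[:lo] + targets[hi:])
        for i in range(lo, hi):
            preds[i] = intercept + slope * xs[i]
    return pearson_r(targets, preds)


# Made-up monthly consumption (m^3) for one hypothetical subscriber, two years.
consumption = [12.0, 11.5, 13.0, 14.2, 15.1, 17.3, 19.0, 18.4, 16.0, 14.1, 12.8, 12.2,
               12.5, 11.9, 13.4, 14.8, 15.9, 18.0, 19.6, 19.1, 16.7, 14.6, 13.1, 12.6]
rows, targets = make_lagged(consumption, n_lags=3)
r = cross_validated_r(rows, targets, k=3)
print(f"rows={len(rows)}, out-of-fold R={r:.3f}")
```

Because the data are ordered in time, a contiguous k-fold split lets the model train on months that come after the held-out ones; a forward-chaining split would avoid that, and the abstract does not say which scheme was used.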